WOSU’s primary source of funding comes from individual community support. With over 27,000 donors providing annual support equaling approximately $4 million with additional major and planned gift support, the development team seeks to develop a deeper understanding of our constituency.
Currently, the development teams work out of multiple databases and static documents with a treasure trove of disparate data about our donors.
From an analytics needs assessment, WOSU is at the base level of analytics and need to lay the foundation. The analysis below is a ground-work in teasing out the reasons of the facts we are aware or unaware regarding first time donors, and will help to be more focussed in the right direction in getting more relevant data and further utilize targeted fundraising approaches.
Compare the use of Bayesian and non-Bayesian statistical models in your Capstone Project:
1. Compare parameter estimates.
2. Extend MCMC output to investigate uncertainty in some policy decision related to your project.
a. Compute effect sizes.
b. Comment on posterior confidence intervals – are your results ‘significant’.
3. Explore alternative decision scenarios.
WOSU has a wide variety of fundraising programs through events, radio, TV , web, emails and personal contact. Which of these tasks contribute to increased giving? Which tasks detract from the fundraising success? What are the factors for their success?
If every year a certain level of first-time donors is sustained it will be cumulative income to the contribution by sustained donors for WOSU.
I will be exploring the following business scenarios for Ohio for Columbus OH(#22255) and other cities in Ohio (#42698) and Organizations:
There are 2 Datasets provided by WOSU:
1. 2000-2009 – 33740 observations (represents Original Donations), 22 variables/attributes
2. 2010-2018- 36361 observations(represents Original Donations), 22variables/attributes
#install.packages("tidyverse")
library(tidyverse)
#install.packages("sqldf")
library(sqldf)
#install.packages("lubridate")
library(lubridate)
#install.packages("dplyr")
library(dplyr)
#install.packages("plyr")
library(plyr)
#install.packages("readxl")
library(readxl)
#install.packages("ggplot2")
library(ggplot2)
library(scales)
library(stringr)
#install.packages("USAboundaries")
library(USAboundaries)
#install.packages("gender")
library(gender)
#install.packages("caret")
library(caret)
library(gridExtra)
#install.packages("VIM")
library(VIM)
library(lubridate)
library(grid)
#install.packages("XML")
library(XML)
#install.packages("httr")
library("httr")
#install.packages("VIM")
library(VIM)
library(bayesm)
setwd("~/Desktop/FISHER SMB-A/capstone/SMB-A Capstone Project")
#Original Gift Date-2000-2009
Gift_00_09 <- read_excel("2000-2009 Original Gift date .xlsx",
col_types = c("numeric", "text", "text",
"text", "text", "text", "text", "text",
"text", "text", "text", "date", "numeric",
"text", "text", "text", "text", "text",
"text", "numeric", "text", "text"))
#Original Gift Date 2010- 2018
Gift_10_18 <- read_excel("2010-2018 Original Gift date .xlsx", col_types = c("numeric", "text", "text",
"text", "text", "text", "text", "text",
"text", "text", "text", "date", "numeric",
"text", "text", "text", "text", "text",
"text", "numeric", "text", "text"))
#ORIGINAL GIFT FILES
#Rename the column names
GiftCol <- c("Account_ID", "Donor_Seq_Name", "Donor_Name", "Letter_Salutation", "Address1", "Address2", "City", "State", "Zipcode", "email_address", "Account_Status", "Original_Gift_Date", "Original_Gift_Amount", "Original_Gift_Source", "Original_Gift_Mode", "Solicitation_Type", "Solicitation_Method", "Orig_Gift_Has_Prm", "Orig_Gift_Pledge_Type", "Orig_Gift_Level_Amount", "PBS_Digital_Token", "Passport_Active_Date" )
colnames(Gift_00_09) <- GiftCol
colnames(Gift_10_18) <- GiftCol
sqldf("select sum(original_gift_amount) from Gift_10_18 where original_gift_source like '%PASSP%'")
## sum(original_gift_amount)
## 1 179303
#Separate the Zip code and Delivery Route
Gift_00_09_01 <- separate(Gift_00_09, Zipcode, c("Zip_code", "DeliveryRoute"))
Gift_10_18_01 <- separate(Gift_10_18, Zipcode, c("Zip_code", "DeliveryRoute"))
#Separate the date to Year and Month
Gift_00_09_02 <- separate(Gift_00_09_01, Original_Gift_Date, c("Gift_Year", "Gift_Month", "Gift_Day"), sep = "-", remove = FALSE)
Gift_10_18_02 <- separate(Gift_10_18_01, Original_Gift_Date, c("Gift_Year", "Gift_Month", "Gift_Day"), sep = "-", remove = FALSE)
Gift_00_09_02$Gift_Year <- as.numeric(Gift_00_09_02$Gift_Year)
Gift_00_09_02$Gift_Month <- as.numeric(Gift_00_09_02$Gift_Month)
Gift_00_09_02$Gift_Day <- as.numeric(Gift_00_09_02$Gift_Day)
Gift_10_18_02$Gift_Year <- as.numeric(Gift_10_18_02$Gift_Year)
Gift_10_18_02$Gift_Month <- as.numeric(Gift_10_18_02$Gift_Month)
Gift_10_18_02$Gift_Day <- as.numeric(Gift_10_18_02$Gift_Day)
#Keep observations only for period 2000-2009 and 2010-2018 in respective tables
Tbl1 <- sqldf(" select * from Gift_00_09_02 where gift_year > 2009 ")
Gift_10_18_03 <- rbind(Gift_10_18_02, Tbl1)
Gift_00_09_03 <- Gift_00_09_02 %>% filter(Gift_Year <= 2009 & Gift_Year >1999)
#changing to Date field
Gift_10_18_03$Original_Gift_Date <- as.Date(Gift_10_18_03$Original_Gift_Date)
Gift_00_09_03$Original_Gift_Date <- as.Date(Gift_00_09_03$Original_Gift_Date)
#removing redundant variables: Donor_Seq_Name, Address1, Address2, Zip_code, DeliveryRoute, Gift Day, Original_Gift_Pledge_Type, Original_Gift_Level_Amount, PBS_Digital_Token, Passport_Active_Date
Gift_00_09_04 <- Gift_00_09_03[, -c( 2, 5, 6, 9, 10, 16, 23, 24, 25, 26)]
Gift_10_18_04 <- Gift_10_18_03[, -c( 2, 5, 6, 9, 10, 16, 23, 24, 25, 26)]
#incorrect states
#state table
usstates <- us_states()
as_tibble(usstates)
## # A tibble: 52 x 13
## statefp statens affgeoid geoid stusps name lsad aland awater
## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl>
## 1 23 017797… 0400000… 23 ME Maine 00 7.99e10 1.17e10
## 2 15 017797… 0400000… 15 HI Hawa… 00 1.66e10 1.18e10
## 3 04 017797… 0400000… 04 AZ Ariz… 00 2.94e11 1.03e 9
## 4 05 000680… 0400000… 05 AR Arka… 00 1.35e11 2.96e 9
## 5 10 017797… 0400000… 10 DE Dela… 00 5.05e 9 1.40e 9
## 6 13 017053… 0400000… 13 GA Geor… 00 1.49e11 4.74e 9
## 7 27 006628… 0400000… 27 MN Minn… 00 2.06e11 1.89e10
## 8 02 017855… 0400000… 02 AK Alas… 00 1.48e12 2.45e11
## 9 06 017797… 0400000… 06 CA Cali… 00 4.04e11 2.05e10
## 10 11 017023… 0400000… 11 DC Dist… 00 1.58e 8 1.86e 7
## # … with 42 more rows, and 4 more variables: state_name <chr>,
## # state_abbr <chr>, jurisdiction_type <chr>, geometry <list>
uss <- usstates$stusps
usst <- as_tibble(uss)
not_us_states <- sqldf("select * from Gift_00_09_04 where State not in (select value from usst)")
not_us_states1 <- sqldf("select * from Gift_00_09_03 a, not_us_states b where a.Donor_Name = b.Donor_Name ")
not_us_states2 <- sqldf("select * from Gift_10_18_04 where State not in (select value from usst)")
not_us_states22 <- sqldf("select * from Gift_10_18_03 a, not_us_states b where a.Donor_Name = b.Donor_Name ")
#Donors Outside US
outside_US <- rbind(not_us_states, not_us_states2)
out_US <- sqldf("select Donor_Name, City, State, Account_Status, Gift_Year, Gift_Month, Original_Gift_Amount, Solicitation_Method from outside_US order by Account_Status ")
There are donors from Canada(from Province=Alberta, Ontario, and Quebec), London UK, Virgin Islands , Guam and from APO address. APO stands for “Army Post Office” and is associated with Army or Air Force installation.
#remove states from main files
Gift_00_09_05 <- sqldf("select * from Gift_00_09_04 where State in (select value from usst)")
Gift_10_18_05 <- sqldf("select * from Gift_10_18_04 where State in (select value from usst)")
#######
# identify the organization
Gift_00_09_05$Letter_Salutation <- tolower(Gift_00_09_05$Letter_Salutation)
Gift_10_18_05$Letter_Salutation <- tolower(Gift_10_18_05$Letter_Salutation)
Gift_09_Org <- Gift_00_09_05 %>% filter( str_detect(Letter_Salutation,"ampaig|oundat|company|group|corp|founda|organization| campaign|firm|establishment|agency|office|enterprise|operation|institution|venture|undertaking|practice|society|league|club|network|fund|process|consult|center|inc|publish|connect|team|plant|design|chamber|associa|person|community|system|service|vinyl|services|solution|oncrete|association|vcc|sisters|carryout|associates|photo%"))
Gift_18_Org <- Gift_10_18_05 %>% filter( str_detect(Letter_Salutation,"ampaig|oundat|compa|group|corp|founda|organization| campaign|firm|establishment|agency|office|enterprise|operation|institution|venture|undertaking|practice|society|league|club|network|fund|process|consult|center|inc|publish|connect|team|plant|design|chamber|associa|person|community|system|service|vinyl|services|solution|oncrete|association|vcc|sisters|carryout|associates|photo%"))
Gift_09_Org_01 <- sqldf("select * from Gift_09_Org where Letter_Salutation not like ('%mr%')")
Gift_09_Org_02 <- sqldf("select * from Gift_09_Org_01 where Letter_Salutation not like ('%ms%')")
Gift_09_Org_03 <- sqldf("select * from Gift_09_Org_02 where Letter_Salutation not like ('%dr%')")
Gift_09_Org_04 <- sqldf("select * from Gift_09_Org_03 where Letter_Salutation not like ('%vinc%')")
Gift_18_Org_01 <- sqldf("select * from Gift_18_Org where Letter_Salutation not like ('%mr%')")
Gift_18_Org_02 <- sqldf("select * from Gift_18_Org_01 where Letter_Salutation not like ('%ms%')")
Gift_18_Org_03 <- sqldf("select * from Gift_18_Org_02 where Letter_Salutation not like ('%dr%')")
Gift_18_Org_04 <- sqldf("select * from Gift_18_Org_03 where Letter_Salutation not like ('%vinc%')")
Gift_Org_Final <- rbind(Gift_09_Org_04, Gift_18_Org_04)
#Get Individuals file removing organizations
Gift_00_09_07 <- sqldf("select * from Gift_00_09_05 where Account_ID not in (select Account_ID from Gift_Org_Final)")
Gift_10_18_07 <- sqldf("select * from Gift_10_18_05 where Account_ID not in (select Account_ID from Gift_Org_Final)")
# get the gender field
Family_name_09 <- Gift_00_09_07 %>% filter( str_detect(Letter_Salutation,"&"))
Family_name_09_01 <- Family_name_09 %>% mutate(gender = NA)
Family_name_09_02 <- Family_name_09_01 %>% separate(Donor_Name, c("First", "Middle1", "Middle2", "Last"))
Family_name_10 <- Gift_10_18_07 %>% filter( str_detect(Letter_Salutation,"&"))
Family_name_10_01 <- Family_name_10 %>% mutate(gender = NA)
Family_name_10_02 <- Family_name_10_01 %>% separate(Donor_Name, c("First", "Middle1", "Middle2", "Last"))
Gen_ID_09 <- sqldf("select * from Gift_00_09_07 where account_id not in (select account_id from Family_name_09)")
Gen_ID_09$Donor_Name <- as.character(Gen_ID_09$Donor_Name)
Gen_ID_09_01 <- Gen_ID_09 %>% separate(Donor_Name, c("First", "Middle1", "Middle2", "Last"))
Gen_ID_09_02 <- Gen_ID_09_01 %>% mutate(min_year = 1940, max_year = 2000)
Gen_ID_09_03 <- Gen_ID_09_02 %>% gender_df(name_col = "First", year_col = c("min_year", "max_year"), method = "ssa")
Gen_ID_09_04 <- Gen_ID_09_02 %>% left_join(Gen_ID_09_03, by = c("First" = "name"))
Gen_ID_09_04_01 <- Gen_ID_09_04 %>% filter(is.na(Gen_ID_09_04$gender) == "FALSE")
Gen_ID_09_05 <- Gen_ID_09_04 %>% filter(is.na(Gen_ID_09_04$gender) == "TRUE")
Gen_ID_09_06 <- Gen_ID_09_05 %>% select(-c(22:26))
Gen_ID_09_07 <- Gen_ID_09_06 %>% gender_df(name_col = "Middle1", year_col = c("min_year", "max_year"), method = "ssa")
Gen_ID_09_08 <- Gen_ID_09_06 %>% left_join(Gen_ID_09_07, by = c("Middle1" = "name"))
Gen_ID_09_08_01 <- Gen_ID_09_08 %>% filter(is.na(gender) == "FALSE")
Gen_ID_09_08_02 <- Gen_ID_09_08 %>% filter(is.na(gender) == "TRUE")
Gen_ID_09_09 <- Gen_ID_09_08_02 %>% select(-c(22:26))
Gen_ID_09_10 <- Gen_ID_09_09 %>% gender_df(name_col = "Middle2", year_col = c("min_year", "max_year"), method = "ssa")
Gen_ID_09_11 <- Gen_ID_09_09 %>% left_join(Gen_ID_09_10, by = c("Middle2" = "name"))
Gen_ID_09_11_01 <- Gen_ID_09_11 %>% filter(is.na(gender) == "FALSE")
Gen_ID_09_11_02 <- Gen_ID_09_11 %>% filter(is.na(gender) == "TRUE")
Gen_ID_09_12 <- Gen_ID_09_11_02 %>% select(-c(22:26))
Gen_ID_09_13 <- Gen_ID_09_12 %>% gender_df(name_col = "Last", year_col = c("min_year", "max_year"), method = "ssa")
Gen_ID_09_14 <- Gen_ID_09_12 %>% left_join(Gen_ID_09_13, by = c("Last" = "name"))
Gen_ID_09_14_01 <- Gen_ID_09_14 %>% filter(is.na(gender) == "FALSE")
Gen_ID_09_14_02 <- Gen_ID_09_14 %>% filter(is.na(gender) == "TRUE")
#get gender from salutation
Gen_ID_09_15 <- Gen_ID_09_14_02 %>% mutate(gender = (ifelse(str_detect(Letter_Salutation, "mr"), "male", " ")))
Gen_ID_09_15_01 <- Gen_ID_09_15 %>% filter(gender == "male")
Gen_ID_09_15_02 <- Gen_ID_09_15 %>% mutate(gender = (ifelse(str_detect(Letter_Salutation, "ms"), "female", " ")))
Gen_ID_09_15_03 <- Gen_ID_09_15_02 %>% filter(gender == "female")
Gen_ID_09_15_04 <- rbind(Gen_ID_09_15_01, Gen_ID_09_15_03)
#remg gender not identified
y_01 <- sqldf("select * from Gen_ID_09_14_02 where account_id not in (select account_id from Gen_ID_09_15_04)")
#remove organization(Letter Salutation was different from Donor_names)
y_01$First <- tolower(y_01$First)
y_01$Middle1 <- tolower(y_01$Middle1)
y_01$Middle2 <- tolower(y_01$Middle2)
y_01$Last <- tolower(y_01$Last)
x_14_org_01 <- y_01 %>% filter( str_detect(Middle1,"ampaig|oundat|compa|group|corp|founda|organization| campaign|firm|establishment|agency|office|enterprise|operation|institution|venture|undertaking|practice|society|league|club|network|fund|process|consult|center|inc|publish|connect|team|plant|design|chamber|associa|person|community|system|service|vinyl|services|solution|oncrete|association|vcc|sisters|carryout|associates|photograp%"))
x_14_org_02 <- y_01 %>% filter( str_detect(Middle2,"ampaig|oundat|compa|group|corp|founda|organization| campaign|firm|establishment|agency|office|enterprise|operation|institution|venture|undertaking|practice|society|league|club|network|fund|process|consult|center|inc|publish|connect|team|plant|design|chamber|associa|person|community|system|service|vinyl|services|solution|oncrete|association|vcc|sisters|carryout|associates|photo|association|concrete%"))
x_14_org <- rbind(x_14_org_01, x_14_org_02 )
x_14_org_04 <- sqldf("select * from Gift_00_09_05 a, x_14_org b where a.account_id = b.account_id ")
x_14_org_05 <- x_14_org_04 %>% select(-c(17:42))
Gift_Org_Final_01 <- rbind(Gift_Org_Final, x_14_org_05)
#could not identify gender
x_15 <- sqldf("select * from y_01 where account_id not in (select account_id from x_14_org_05)")
x_16 <- x_15 %>% select(-c(20:23, 25:26))
#set the variable#
Gen_ID_09_04_03 <- Gen_ID_09_04_01 %>% select(-c(20:23), -c(25:26))
Gen_ID_09_08_03 <- Gen_ID_09_08_01 %>% select(-c(20:23), -c(25:26))
Gen_ID_09_11_03 <- Gen_ID_09_11_01 %>% select(-c(20:23), -c(25:26))
Gen_ID_09_14_03 <- Gen_ID_09_14_01 %>% select(-c(20:23), -c(25:26))
Gen_ID_09_15_05 <- Gen_ID_09_15_04 %>% select(-c(20:23), -c(25:26))
Gift_09_gender_final = rbind(Gen_ID_09_04_03, Gen_ID_09_08_03, Gen_ID_09_11_03, Gen_ID_09_14_03, Gen_ID_09_15_05, x_16, Family_name_09_02 )
###################################################################
#Repeat for Gift_18_Org
Gen_ID_18 <- sqldf("select * from Gift_10_18_07 where account_id not in (select account_id from Family_name_10)")
Gen_ID_18$Donor_Name <- as.character(Gen_ID_18$Donor_Name)
Gen_ID_18_01 <- Gen_ID_18 %>% separate(Donor_Name, c("First", "Middle1", "Middle2", "Last"))
Gen_ID_18_02 <- Gen_ID_18_01 %>% mutate(min_year = 1940, max_year = 2000)
Gen_ID_18_03 <- Gen_ID_18_02 %>% gender_df(name_col = "First", year_col = c("min_year", "max_year"), method = "ssa")
Gen_ID_18_04 <- Gen_ID_18_02 %>% left_join(Gen_ID_18_03, by = c("First" = "name"))
Gen_ID_18_04_01 <- Gen_ID_18_04 %>% filter(is.na(Gen_ID_18_04$gender) == "FALSE")
Gen_ID_18_05 <- Gen_ID_18_04 %>% filter(is.na(Gen_ID_18_04$gender) == "TRUE")
Gen_ID_18_06 <- Gen_ID_18_05 %>% select(-c(22:26))
Gen_ID_18_07 <- Gen_ID_18_06 %>% gender_df(name_col = "Middle1", year_col = c("min_year", "max_year"), method = "ssa")
Gen_ID_18_08 <- Gen_ID_18_06 %>% left_join(Gen_ID_18_07, by = c("Middle1" = "name"))
Gen_ID_18_08_01 <- Gen_ID_18_08 %>% filter(is.na(gender) == "FALSE")
Gen_ID_18_08_02 <- Gen_ID_18_08 %>% filter(is.na(gender) == "TRUE")
Gen_ID_18_09 <- Gen_ID_18_08_02 %>% select(-c(22:26))
Gen_ID_18_10 <- Gen_ID_18_09 %>% gender_df(name_col = "Middle2", year_col = c("min_year", "max_year"), method = "ssa")
Gen_ID_18_11 <- Gen_ID_18_09 %>% left_join(Gen_ID_18_10, by = c("Middle2" = "name"))
Gen_ID_18_11_01 <- Gen_ID_18_11 %>% filter(is.na(gender) == "FALSE")
Gen_ID_18_11_02 <- Gen_ID_18_11 %>% filter(is.na(gender) == "TRUE")
Gen_ID_18_12 <- Gen_ID_18_11_02 %>% select(-c(22:26))
Gen_ID_18_13 <- Gen_ID_18_12 %>% gender_df(name_col = "Last", year_col = c("min_year", "max_year"), method = "ssa")
Gen_ID_18_14 <- Gen_ID_18_12 %>% left_join(Gen_ID_18_13, by = c("Last" = "name"))
Gen_ID_18_14_01 <- Gen_ID_18_14 %>% filter(is.na(gender) == "FALSE")
Gen_ID_18_14_02 <- Gen_ID_18_14 %>% filter(is.na(gender) == "TRUE")
#get gender from salutation
Gen_ID_18_15 <- Gen_ID_18_14_02 %>% mutate(gender = (ifelse(str_detect(Letter_Salutation, "mr"), "male", " ")))
Gen_ID_18_15_01 <- Gen_ID_18_15 %>% filter(gender == "male")
Gen_ID_18_15_02 <- Gen_ID_18_15 %>% mutate(gender = (ifelse(str_detect(Letter_Salutation, "ms"), "female", " ")))
Gen_ID_18_15_03 <- Gen_ID_18_15_02 %>% filter(gender == "female")
Gen_ID_18_15_04 <- rbind(Gen_ID_18_15_01, Gen_ID_18_15_03)
y_11 <- sqldf("select * from Gen_ID_18_14_02 where account_id not in (select account_id from Gen_ID_18_15_04)")
#remove organization(Letter Salutation was different from Donor_names)
y_11$First <- tolower(y_11$First)
y_11$Middle1 <- tolower(y_11$Middle1)
y_11$Middle2 <- tolower(y_11$Middle2)
y_11$Last <- tolower(y_11$Last)
x_24_org_01 <- y_11 %>% filter( str_detect(Middle1,"ampaig|oundat|compa|group|corp|founda|organization| campaign|firm|establishment|agency|office|enterprise|operation|institution|venture|undertaking|practice|society|league|club|network|fund|process|consult|center|inc|publish|connect|team|plant|design|chamber|associa|person|community|system|service|vinyl|services|solution|oncrete|association|vcc|sisters|carryout|associates|photo|school|computer|therapy|propertie|dept|product|program|builder|chevrolet|medical|market|partners|member|blooms%"))
x_24_org_02 <- y_11 %>% filter( str_detect(Middle2,"ampaig|oundat|compa|group|corp|founda|organization| campaign|firm|establishment|agency|office|enterprise|operation|institution|venture|undertaking|practice|society|league|club|network|fund|process|consult|center|inc|publish|connect|team|plant|design|chamber|associa|person|community|system|service|vinyl|services|solution|oncrete|association|vcc|sisters|carryout|associates|photo|school|computer|therapy|auto|propertie|dept|product|program|builder|chevrolet|medical|market|partners|member%"))
x_24_org_03 <- rbind(x_24_org_01 , x_24_org_02 )
x_24_org_04 <- sqldf("select * from Gift_10_18_05 a, x_24_org_03 b where a.account_id = b.account_id ")
x_24_org_05 <- x_24_org_04 %>% select(-c(17:42))
#Final Organization file
Gift_Org_Final_02 <- rbind(Gift_Org_Final_01, x_24_org_05)
x_25 <- sqldf("select * from y_11 where account_id not in (select account_id from x_24_org_05)")
#could not filter gender
x_26 <- x_25 %>% select(-c(20:23, 25:26))
#set the variable#
Gen_ID_18_04_03 <- Gen_ID_18_04_01 %>% select(-c(20:23), -c(25:26))
Gen_ID_18_08_03 <- Gen_ID_18_08_01 %>% select(-c(20:23), -c(25:26))
Gen_ID_18_11_03 <- Gen_ID_18_11_01 %>% select(-c(20:23), -c(25:26))
#Gen_ID_18_14_03 <- Gen_ID_18_14_01 %>% select(-c(20:23), -c(25:26))
Gen_ID_18_15_05 <- Gen_ID_18_15_04 %>% select(-c(20:23), -c(25:26))
Gift_18_gender_final = rbind(Gen_ID_18_04_03, Gen_ID_18_08_03, Gen_ID_18_11_03, Gen_ID_18_15_05, x_26, Family_name_10_02 )
####################################################################
# separate file for OH State
Gift_OH_09 <-Gift_09_gender_final %>% filter(State == "OH")
Gift_OH_18 <- Gift_18_gender_final %>% filter(State == "OH")
#Rest of the states in US
Gift_00_09_08 <- sqldf("select * from Gift_09_gender_final where State not in ('OH')")
Gift_10_18_08 <- sqldf("select * from Gift_18_gender_final where State not in ('OH')")
#Original Gift Source to be condensed to root words
#In Ohio file
Gift_OH_09$Original_Gift_Source <- tolower(Gift_OH_09$Original_Gift_Source)
#radio/fm
Gift_OH_09_radio <- Gift_OH_09 %>% filter(str_detect(Original_Gift_Source, "fm|radio|am|air|renewal|adjust"))
radio <- "radio"
Gift_OH_09_radio$Original_Gift_Source <- radio
#tv
Gift_OH_09_remg <- sqldf("select * from Gift_OH_09 where account_id not in (select account_id from Gift_OH_09_radio ) ")
Gift_OH_09_tv <- Gift_OH_09_remg %>% filter(str_detect(Original_Gift_Source, "tv|toop"))
tv <- "tv"
Gift_OH_09_tv$Original_Gift_Source <- tv
#web
Gift_OH_09_remg_01 <- sqldf("select * from Gift_OH_09_remg where account_id not in (select account_id from Gift_OH_09_tv) ")
Gift_OH_09_web <- Gift_OH_09_remg_01 %>% filter(str_detect(Original_Gift_Source, "internet|web"))
web <- "web"
Gift_OH_09_web$Original_Gift_Source <- web
#New year
Gift_OH_09_remg_02 <- sqldf("select * from Gift_OH_09_remg_01 where account_id not in (select account_id from Gift_OH_09_web) ")
Gift_OH_09_newyear <- Gift_OH_09_remg_02 %>% filter(str_detect(Original_Gift_Source, "happy"))
new_year <- "new year"
Gift_OH_09_newyear$Original_Gift_Source <- new_year
#membership drive
Gift_OH_09_remg_03 <- sqldf("select * from Gift_OH_09_remg_02 where account_id not in (select account_id from Gift_OH_09_newyear) ")
Gift_OH_09_mbr <- Gift_OH_09_remg_03 %>% filter(str_detect(Original_Gift_Source, "mbrship|member|pledge"))
mbr <- "membership drive"
Gift_OH_09_mbr$Original_Gift_Source <- radio
#acquisition
Gift_OH_09_remg_04 <- sqldf("select * from Gift_OH_09_remg_03 where account_id not in (select account_id from Gift_OH_09_mbr) ")
Gift_OH_09_acq <- Gift_OH_09_remg_04 %>% filter(str_detect(Original_Gift_Source, "acq|acqui"))
acq <- "acquisition"
Gift_OH_09_acq$Original_Gift_Source <- acq
#annual fund
Gift_OH_09_remg_05 <- sqldf("select * from Gift_OH_09_remg_04 where account_id not in (select account_id from Gift_OH_09_acq) ")
Gift_OH_09_af <- Gift_OH_09_remg_05 %>% filter(str_detect(Original_Gift_Source, "fund|annual"))
anfund <- "annual fund"
Gift_OH_09_af$Original_Gift_Source <- anfund
#cba
Gift_OH_09_remg_06 <- sqldf("select * from Gift_OH_09_remg_05 where account_id not in (select account_id from Gift_OH_09_af) ")
Gift_OH_09_cba <- Gift_OH_09_remg_06 %>% filter(str_detect(Original_Gift_Source, "cba"))
mail <- "mail"
Gift_OH_09_cba$Original_Gift_Source <- mail
#ot white
Gift_OH_09_remg_07 <- sqldf("select * from Gift_OH_09_remg_06 where account_id not in (select account_id from Gift_OH_09_cba) ")
Gift_OH_09_otw<- Gift_OH_09_remg_07 %>% filter(str_detect(Original_Gift_Source, "white"))
otw <- "otwhite"
Gift_OH_09_otw$Original_Gift_Source <- otw
#mail
Gift_OH_09_remg_08 <- sqldf("select * from Gift_OH_09_remg_07 where account_id not in (select account_id from Gift_OH_09_otw) ")
Gift_OH_09_mail <- Gift_OH_09_remg_08 %>% filter(str_detect(Original_Gift_Source, "mail"))
mail <- "mail"
Gift_OH_09_mail$Original_Gift_Source <- mail
#year end
Gift_OH_09_remg_09 <- sqldf("select * from Gift_OH_09_remg_08 where account_id not in (select account_id from Gift_OH_09_mail) ")
Gift_OH_09_yrend <- Gift_OH_09_remg_09 %>% filter(str_detect(Original_Gift_Source, "year end"))
yrend <- "year end"
Gift_OH_09_yrend$Original_Gift_Source <- yrend
#showcase
Gift_OH_09_remg_10 <- sqldf("select * from Gift_OH_09_remg_09 where account_id not in (select account_id from Gift_OH_09_yrend) ")
Gift_OH_09_shcase<- Gift_OH_09_remg_10 %>% filter(str_detect(Original_Gift_Source, "showcase"))
showcase <- "showcase"
Gift_OH_09_shcase$Original_Gift_Source <- showcase
#tahnksgiving
Gift_OH_09_remg_11 <- sqldf("select * from Gift_OH_09_remg_10 where account_id not in (select account_id from Gift_OH_09_shcase) ")
Gift_OH_09_thk <- Gift_OH_09_remg_11 %>% filter(str_detect(Original_Gift_Source, "thanksgiv"))
thk <- "thanksgiving"
Gift_OH_09_thk$Original_Gift_Source <- radio
#special opportunity
Gift_OH_09_remg_12 <- sqldf("select * from Gift_OH_09_remg_11 where account_id not in (select account_id from Gift_OH_09_thk) ")
Gift_OH_09_so <- Gift_OH_09_remg_12 %>% filter(str_detect(Original_Gift_Source, "oppo|opp|special|auction|classic|gift|car|celtic|war|christian|memory|laps|dmw"))
opportunity <- "special opportunity"
Gift_OH_09_so$Original_Gift_Source <- opportunity
#On Air -radio- remaining
Gift_OH_09_remg_13 <- sqldf("select * from Gift_OH_09_remg_12 where account_id not in (select account_id from Gift_OH_09_so) ")
Gift_OH_09_onair_rmg <- sqldf("select * from Gift_OH_09_remg_13 where solicitation_method like '%On Air%'")
Gift_OH_09_onair_rmg$Original_Gift_Source <- radio
#Mail - remaining
Gift_OH_09_mail_rmg <- sqldf("select * from Gift_OH_09_remg_13 where solicitation_method like '%ail%'")
Gift_OH_09_mail_rmg$Original_Gift_Source <- mail
#Gift source with root word
Gift_OH_09_source_root <- rbind(Gift_OH_09_radio, Gift_OH_09_tv, Gift_OH_09_web, Gift_OH_09_newyear, Gift_OH_09_mbr, Gift_OH_09_acq, Gift_OH_09_af, Gift_OH_09_cba, Gift_OH_09_otw, Gift_OH_09_mail, Gift_OH_09_yrend, Gift_OH_09_shcase, Gift_OH_09_thk, Gift_OH_09_so, Gift_OH_09_onair_rmg, Gift_OH_09_mail_rmg)
########
Gift_10_18_08$Original_Gift_Source <- tolower(Gift_10_18_08$Original_Gift_Source)
#radio/fm
Gift_10_18_08_radio <- Gift_10_18_08 %>% filter(str_detect(Original_Gift_Source, "fm|radio|am|air|renewal|adjust"))
radio <- "radio"
Gift_10_18_08_radio$Original_Gift_Source <- radio
#tv
Gift_10_18_08_remg <- sqldf("select * from Gift_10_18_08 where account_id not in (select account_id from Gift_10_18_08_radio ) ")
Gift_10_18_08_tv <- Gift_10_18_08_remg %>% filter(str_detect(Original_Gift_Source, "tv|toop"))
tv <- "tv"
Gift_10_18_08_tv$Original_Gift_Source <- tv
#web
Gift_10_18_08_remg_01 <- sqldf("select * from Gift_10_18_08_remg where account_id not in (select account_id from Gift_10_18_08_tv) ")
Gift_10_18_08_web <- Gift_10_18_08_remg_01 %>% filter(str_detect(Original_Gift_Source, "internet|web"))
web <- "web"
Gift_10_18_08_web$Original_Gift_Source <- web
#New year
Gift_10_18_08_remg_02 <- sqldf("select * from Gift_10_18_08_remg_01 where account_id not in (select account_id from Gift_10_18_08_web) ")
Gift_10_18_08_newyear <- Gift_10_18_08_remg_02 %>% filter(str_detect(Original_Gift_Source, "happy"))
new_year <- "new year"
Gift_10_18_08_newyear$Original_Gift_Source <- new_year
#acquisition
Gift_10_18_08_remg_03 <- sqldf("select * from Gift_10_18_08_remg_02 where account_id not in (select account_id from Gift_10_18_08_newyear) ")
Gift_10_18_08_acq <- Gift_10_18_08_remg_03 %>% filter(str_detect(Original_Gift_Source, "acq|acqui"))
acq <- "acquisition"
Gift_10_18_08_acq$Original_Gift_Source <- acq
#annual fund
Gift_10_18_08_remg_04 <- sqldf("select * from Gift_10_18_08_remg_03 where account_id not in (select account_id from Gift_10_18_08_acq) ")
Gift_10_18_08_af <- Gift_10_18_08_remg_04 %>% filter(str_detect(Original_Gift_Source, "fund|annual"))
anfund <- "annual fund"
Gift_10_18_08_af$Original_Gift_Source <- anfund
#ot white
Gift_10_18_08_remg_05 <- sqldf("select * from Gift_10_18_08_remg_04 where account_id not in (select account_id from Gift_10_18_08_af) ")
Gift_10_18_08_otw<- Gift_10_18_08_remg_05 %>% filter(str_detect(Original_Gift_Source, "white"))
otw <- "otwhite"
Gift_10_18_08_otw$Original_Gift_Source <- otw
#year end
Gift_10_18_08_remg_06 <- sqldf("select * from Gift_10_18_08_remg_05 where account_id not in (select account_id from Gift_10_18_08_otw) ")
Gift_10_18_08_yrend <- Gift_10_18_08_remg_06 %>% filter(str_detect(Original_Gift_Source, "year end"))
yrend <- "year end"
Gift_10_18_08_yrend$Original_Gift_Source <- yrend
#passport
Gift_10_18_08_remg_07 <- sqldf("select * from Gift_10_18_08_remg_06 where account_id not in (select account_id from Gift_10_18_08_yrend) ")
Gift_10_18_08_passport <- Gift_10_18_08_remg_07 %>% filter(str_detect(Original_Gift_Source, "passport"))
passport <- "passport"
Gift_10_18_08_passport$Original_Gift_Source <- passport
#special opportunity
Gift_10_18_08_remg_08 <- sqldf("select * from Gift_10_18_08_remg_07 where account_id not in (select account_id from Gift_10_18_08_passport) ")
Gift_10_18_08_so <- Gift_10_18_08_remg_08 %>% filter(str_detect(Original_Gift_Source, "oppo|opp|special|auction|classic|gift|car|celtic|war|christian|memory|laps|dmw|match|pitch|give|sponser|founder|ticket"))
opportunity <- "special opportunity"
Gift_10_18_08_so$Original_Gift_Source <- opportunity
#Mail - remaining
Gift_10_18_08_remg_09 <- sqldf("select * from Gift_10_18_08_remg_08 where account_id not in (select account_id from Gift_10_18_08_so) ")
Gift_10_18_08_mail_rmg <- sqldf("select * from Gift_10_18_08_remg_09 where solicitation_method like '%ail%'")
Gift_10_18_08_mail_rmg$Original_Gift_Source <- mail
#On air remg
Gift_10_18_08_remg_10 <- sqldf("select * from Gift_10_18_08_remg_09 where account_id not in (select account_id from Gift_10_18_08_mail_rmg) ")
Gift_10_18_08_remg_10$Original_Gift_Source <- radio
Gift_10_18_08_source_root <- rbind(Gift_10_18_08_radio, Gift_10_18_08_tv, Gift_10_18_08_web, Gift_10_18_08_newyear, Gift_10_18_08_acq, Gift_10_18_08_af, Gift_10_18_08_otw, Gift_10_18_08_yrend, Gift_10_18_08_passport, Gift_10_18_08_so, Gift_10_18_08_mail_rmg, Gift_10_18_08_remg_10)
########################
Gift_OH_18$Original_Gift_Source <- tolower(Gift_OH_18$Original_Gift_Source)
#radio/fm
Gift_OH_18_radio <- Gift_OH_18 %>% filter(str_detect(Original_Gift_Source, "fm|radio|am|air|renewal|adjust"))
radio <- "radio"
Gift_OH_18_radio$Original_Gift_Source <- radio
#tv
Gift_OH_18_remg <- sqldf("select * from Gift_OH_18 where account_id not in (select account_id from Gift_OH_18_radio ) ")
Gift_OH_18_tv <- Gift_OH_18_remg %>% filter(str_detect(Original_Gift_Source, "tv|toop"))
tv <- "tv"
Gift_OH_18_tv$Original_Gift_Source <- tv
#web
Gift_OH_18_remg_01 <- sqldf("select * from Gift_OH_18_remg where account_id not in (select account_id from Gift_OH_18_tv) ")
Gift_OH_18_web <- Gift_OH_18_remg_01 %>% filter(str_detect(Original_Gift_Source, "internet|web"))
web <- "web"
Gift_OH_18_web$Original_Gift_Source <- web
#New year
Gift_OH_18_remg_02 <- sqldf("select * from Gift_OH_18_remg_01 where account_id not in (select account_id from Gift_OH_18_web) ")
Gift_OH_18_newyear <- Gift_OH_18_remg_02 %>% filter(str_detect(Original_Gift_Source, "happy"))
new_year <- "new year"
Gift_OH_18_newyear$Original_Gift_Source <- new_year
#membership drive
Gift_OH_18_remg_03 <- sqldf("select * from Gift_OH_18_remg_02 where account_id not in (select account_id from Gift_OH_18_newyear) ")
Gift_OH_18_mbr <- Gift_OH_18_remg_03 %>% filter(str_detect(Original_Gift_Source, "mbrship|member|pledge"))
mbr <- "membership drive"
Gift_OH_18_mbr$Original_Gift_Source <- mail
#acquisition
Gift_OH_18_remg_04 <- sqldf("select * from Gift_OH_18_remg_03 where account_id not in (select account_id from Gift_OH_18_mbr) ")
Gift_OH_18_acq <- Gift_OH_18_remg_04 %>% filter(str_detect(Original_Gift_Source, "acq|acqui"))
acq <- "acquisition"
Gift_OH_18_acq$Original_Gift_Source <- acq
#annual fund
Gift_OH_18_remg_05 <- sqldf("select * from Gift_OH_18_remg_04 where account_id not in (select account_id from Gift_OH_18_acq) ")
Gift_OH_18_af <- Gift_OH_18_remg_05 %>% filter(str_detect(Original_Gift_Source, "fund|annual"))
anfund <- "annual fund"
Gift_OH_18_af$Original_Gift_Source <- anfund
#cba
Gift_OH_18_remg_06 <- sqldf("select * from Gift_OH_18_remg_05 where account_id not in (select account_id from Gift_OH_18_af) ")
Gift_OH_18_cba <- Gift_OH_18_remg_06 %>% filter(str_detect(Original_Gift_Source, "cba"))
cba <- "cba"
Gift_OH_18_cba$Original_Gift_Source <- acq
#ot white
Gift_OH_18_remg_07 <- sqldf("select * from Gift_OH_18_remg_06 where account_id not in (select account_id from Gift_OH_18_cba) ")
Gift_OH_18_otw<- Gift_OH_18_remg_07 %>% filter(str_detect(Original_Gift_Source, "white"))
otw <- "otwhite"
Gift_OH_18_otw$Original_Gift_Source <- otw
#mail
Gift_OH_18_remg_08 <- sqldf("select * from Gift_OH_18_remg_07 where account_id not in (select account_id from Gift_OH_18_otw) ")
Gift_OH_18_mail <- Gift_OH_18_remg_08 %>% filter(str_detect(Original_Gift_Source, "mail"))
mail <- "mail"
Gift_OH_18_mail$Original_Gift_Source <- mail
#year end
Gift_OH_18_remg_09 <- sqldf("select * from Gift_OH_18_remg_08 where account_id not in (select account_id from Gift_OH_18_mail) ")
Gift_OH_18_yrend <- Gift_OH_18_remg_09 %>% filter(str_detect(Original_Gift_Source, "year end"))
yrend <- "year end"
Gift_OH_18_yrend$Original_Gift_Source <- yrend
#passport
Gift_OH_18_remg_10 <- sqldf("select * from Gift_OH_18_remg_09 where account_id not in (select account_id from Gift_OH_18_yrend) ")
Gift_OH_18_passport <- Gift_OH_18_remg_10 %>% filter(str_detect(Original_Gift_Source, "passport"))
passport <- "passport"
Gift_OH_18_passport$Original_Gift_Source <- passport
#open ask mail
Gift_OH_18_remg_11 <- sqldf("select * from Gift_OH_18_remg_10 where account_id not in (select account_id from Gift_OH_18_passport) ")
Gift_OH_18_oa <- Gift_OH_18_remg_11 %>% filter(str_detect(Original_Gift_Source, "open ask"))
Gift_OH_18_oa$Original_Gift_Source <- mail
#special opportunity
Gift_OH_18_remg_12 <- sqldf("select * from Gift_OH_18_remg_11 where account_id not in (select account_id from Gift_OH_18_oa) ")
Gift_OH_18_so <- Gift_OH_18_remg_12 %>% filter(str_detect(Original_Gift_Source, "oppo|opp|special|auction|classic|gift|car|celtic|war|christian|memory|laps|dmw|match|pitch|give|sponser|founder"))
opportunity <- "special opportunity"
Gift_OH_18_so$Original_Gift_Source <- opportunity
#On Air -radio- remaining
Gift_OH_18_remg_13 <- sqldf("select * from Gift_OH_18_remg_12 where account_id not in (select account_id from Gift_OH_18_so) ")
Gift_OH_18_onair_rmg <- sqldf("select * from Gift_OH_18_remg_13 where solicitation_method like '%On Air%'")
Gift_OH_18_onair_rmg$Original_Gift_Source <- radio
#Mail - remaining
Gift_OH_18_mail_rmg <- sqldf("select * from Gift_OH_18_remg_13 where solicitation_method like '%ail%'")
Gift_OH_18_mail_rmg$Original_Gift_Source <- mail
#Web - remaining
Gift_OH_18_web_rmg <- sqldf("select * from Gift_OH_18_remg_13 where solicitation_method like '%eb%'")
Gift_OH_18_web_rmg$Original_Gift_Source <- web
#- remaining
Gift_OH_18_ot_rmg <- sqldf("select * from Gift_OH_18_remg_13 where account_id not in (select account_id from Gift_OH_18_web_rmg )")
Gift_OH_18_ot1_rmg <- sqldf("select * from Gift_OH_18_ot_rmg where account_id not in (select account_id from Gift_OH_18_mail_rmg )")
Gift_OH_18_ot2_rmg <- sqldf("select * from Gift_OH_18_ot1_rmg where account_id not in (select account_id from Gift_OH_18_onair_rmg )")
Gift_OH_18_ot2_rmg$Original_Gift_Source <- radio
#Gift source with root word
Gift_OH_18_source_root <- rbind(Gift_OH_18_radio, Gift_OH_18_tv, Gift_OH_18_web, Gift_OH_18_newyear, Gift_OH_18_mbr, Gift_OH_18_acq, Gift_OH_18_af, Gift_OH_18_cba, Gift_OH_18_otw, Gift_OH_18_mail, Gift_OH_18_yrend, Gift_OH_18_passport, Gift_OH_18_oa, Gift_OH_18_so, Gift_OH_18_onair_rmg, Gift_OH_18_mail_rmg, Gift_OH_18_web_rmg, Gift_OH_18_ot2_rmg)
#############
# In organization file
Gift_Org_Final_02$Original_Gift_Source <- tolower(Gift_Org_Final_02$Original_Gift_Source)
#radio/fm
Gift_Org_Final_02_radio <- Gift_Org_Final_02 %>% filter(str_detect(Original_Gift_Source, "fm|radio|am|air|renewal|adjust"))
radio <- "radio"
Gift_Org_Final_02_radio$Original_Gift_Source <- radio
#tv
Gift_Org_Final_02_remg <- sqldf("select * from Gift_Org_Final_02 where account_id not in (select account_id from Gift_Org_Final_02_radio ) ")
Gift_Org_Final_02_tv <- Gift_Org_Final_02_remg %>% filter(str_detect(Original_Gift_Source, "tv|toop"))
tv <- "tv"
Gift_Org_Final_02_tv$Original_Gift_Source <- tv
#web
Gift_Org_Final_02_remg_01 <- sqldf("select * from Gift_Org_Final_02_remg where account_id not in (select account_id from Gift_Org_Final_02_tv) ")
Gift_Org_Final_02_web <- Gift_Org_Final_02_remg_01 %>% filter(str_detect(Original_Gift_Source, "internet|web"))
web <- "web"
Gift_Org_Final_02_web$Original_Gift_Source <- web
#acquisition
Gift_Org_Final_02_remg_04 <- sqldf("select * from Gift_Org_Final_02_remg_01 where account_id not in (select account_id from Gift_Org_Final_02_web) ")
Gift_Org_Final_02_acq <- Gift_Org_Final_02_remg_04 %>% filter(str_detect(Original_Gift_Source, "acq|acqui"))
acq <- "acquisition"
Gift_Org_Final_02_acq$Original_Gift_Source <- acq
#annual fund
Gift_Org_Final_02_remg_05 <- sqldf("select * from Gift_Org_Final_02_remg_04 where account_id not in (select account_id from Gift_Org_Final_02_acq) ")
Gift_Org_Final_02_af <- Gift_Org_Final_02_remg_05 %>% filter(str_detect(Original_Gift_Source, "fund|annual"))
anfund <- "annual fund"
Gift_Org_Final_02_af$Original_Gift_Source <- anfund
#special opportunity
Gift_Org_Final_02_remg_12 <- sqldf("select * from Gift_Org_Final_02_remg_05 where account_id not in (select account_id from Gift_Org_Final_02_af) ")
Gift_Org_Final_02_so <- Gift_Org_Final_02_remg_12 %>% filter(str_detect(Original_Gift_Source, "oppo|opp|special|auction|classic|gift|car|celtic|war|christian|memory|laps|dmw|match|pitch|give|sponser|founder"))
opportunity <- "special opportunity"
Gift_Org_Final_02_so$Original_Gift_Source <- opportunity
#ot white
Gift_Org_Final_02_remg_13 <- sqldf("select * from Gift_Org_Final_02_remg_12 where account_id not in (select account_id from Gift_Org_Final_02_so) ")
Gift_Org_Final_02_otw<- Gift_Org_Final_02_remg_13 %>% filter(str_detect(Original_Gift_Source, "white"))
otw <- "otwhite"
Gift_Org_Final_02_otw$Original_Gift_Source <- otw
#Web - remaining
Gift_Org_Final_02_web_rmg <- sqldf("select * from Gift_Org_Final_02_remg_13 where account_id not in (select account_id from Gift_Org_Final_02_otw)")
Gift_Org_Final_02_web_rmg_01 <- sqldf("select * from Gift_Org_Final_02_web_rmg where solicitation_method like '%eb%'")
Gift_Org_Final_02_web_rmg_01$Original_Gift_Source <- web
#radio - remaining
Gift_Org_Final_02_radio_rmg <- sqldf("select * from Gift_Org_Final_02_web_rmg where account_id not in (select account_id from Gift_Org_Final_02_web_rmg_01)")
Gift_Org_Final_02_radio_rmg$Original_Gift_Source <- radio
#Gift source with root word
Gift_Org_Final_source_root <- rbind(Gift_Org_Final_02_radio, Gift_Org_Final_02_tv, Gift_Org_Final_02_web,
Gift_Org_Final_02_acq, Gift_Org_Final_02_af, Gift_Org_Final_02_so, Gift_Org_Final_02_otw, Gift_Org_Final_02_web_rmg_01, Gift_Org_Final_02_radio_rmg )
#################
# In states other than Ohio
Gift_00_09_08$Original_Gift_Source <- tolower(Gift_00_09_08$Original_Gift_Source)
#radio/fm
Gift_00_09_08_radio <- Gift_00_09_08 %>% filter(str_detect(Original_Gift_Source, "fm|radio|am|air|renewal|adjust"))
radio <- "radio"
Gift_00_09_08_radio$Original_Gift_Source <- radio
#tv
Gift_00_09_08_remg <- sqldf("select * from Gift_00_09_08 where account_id not in (select account_id from Gift_00_09_08_radio ) ")
Gift_00_09_08_tv <- Gift_00_09_08_remg %>% filter(str_detect(Original_Gift_Source, "tv|toop"))
tv <- "tv"
Gift_00_09_08_tv$Original_Gift_Source <- tv
Gift_00_09_08_remg_01 <- sqldf("select * from Gift_00_09_08_remg where account_id not in (select account_id from Gift_00_09_08_tv) ")
Gift_00_09_08_web <- Gift_00_09_08_remg_01 %>% filter(str_detect(Original_Gift_Source, "internet|web"))
web <- "web"
Gift_00_09_08_web$Original_Gift_Source <- web
#New year
Gift_00_09_08_remg_02 <- sqldf("select * from Gift_00_09_08_remg_01 where account_id not in (select account_id from Gift_00_09_08_web) ")
Gift_00_09_08_newyear <- Gift_00_09_08_remg_02 %>% filter(str_detect(Original_Gift_Source, "happy"))
new_year <- "new year"
Gift_00_09_08_newyear$Original_Gift_Source <- new_year
#acquissition
Gift_00_09_08_remg_03 <- sqldf("select * from Gift_00_09_08_remg_02 where account_id not in (select account_id from Gift_00_09_08_newyear) ")
Gift_00_09_08_acq <- Gift_00_09_08_remg_03 %>% filter(str_detect(Original_Gift_Source, "acq|acqui"))
acq <- "acquisition"
Gift_00_09_08_acq$Original_Gift_Source <- acq
#annual fund
Gift_00_09_08_remg_04 <- sqldf("select * from Gift_00_09_08_remg_03 where account_id not in (select account_id from Gift_00_09_08_acq) ")
Gift_00_09_08_af <- Gift_00_09_08_remg_04 %>% filter(str_detect(Original_Gift_Source, "fund|annual"))
anfund <- "annual fund"
Gift_00_09_08_af$Original_Gift_Source <- anfund
#ot white
Gift_00_09_08_remg_05 <- sqldf("select * from Gift_00_09_08_remg_04 where account_id not in (select account_id from Gift_00_09_08_af) ")
Gift_00_09_08_otw<- Gift_00_09_08_remg_05 %>% filter(str_detect(Original_Gift_Source, "white"))
otw <- "otwhite"
Gift_00_09_08_otw$Original_Gift_Source <- otw
#mail
Gift_00_09_08_remg_06 <- sqldf("select * from Gift_00_09_08_remg_05 where account_id not in (select account_id from Gift_00_09_08_otw) ")
Gift_00_09_08_mail <- Gift_00_09_08_remg_06 %>% filter(str_detect(Original_Gift_Source, "mail"))
mail <- "mail"
Gift_00_09_08_mail$Original_Gift_Source <- mail
#year end
Gift_00_09_08_remg_07 <- sqldf("select * from Gift_00_09_08_remg_06 where account_id not in (select account_id from Gift_00_09_08_mail) ")
Gift_00_09_08_yrend <- Gift_00_09_08_remg_07 %>% filter(str_detect(Original_Gift_Source, "year end"))
yrend <- "year end"
Gift_00_09_08_yrend$Original_Gift_Source <- yrend
#showcase
Gift_00_09_08_remg_08 <- sqldf("select * from Gift_00_09_08_remg_07 where account_id not in (select account_id from Gift_00_09_08_yrend) ")
Gift_00_09_08_showcase <- Gift_00_09_08_remg_08 %>% filter(str_detect(Original_Gift_Source, "showcase"))
showcase <- "showcase"
Gift_00_09_08_showcase$Original_Gift_Source <- showcase
#special opportunity
Gift_00_09_08_remg_09 <- sqldf("select * from Gift_00_09_08_remg_08 where account_id not in (select account_id from Gift_00_09_08_showcase) ")
Gift_00_09_08_so <- Gift_00_09_08_remg_09 %>% filter(str_detect(Original_Gift_Source, "oppo|opp|special|auction|classic|gift|car|celtic|war|christian|memory|laps|dmw|match|pitch|give|sponser|founder|trip"))
opportunity <- "special opportunity"
Gift_00_09_08_so$Original_Gift_Source <- opportunity
#membership drive
Gift_00_09_08_remg_10 <- sqldf("select * from Gift_00_09_08_remg_09 where account_id not in (select account_id from Gift_00_09_08_so) ")
Gift_00_09_08_mbr <- Gift_00_09_08_remg_10 %>% filter(str_detect(Original_Gift_Source, "mbrship|member|pledge"))
mbr <- "membership drive"
Gift_00_09_08_mbr$Original_Gift_Source <- radio
#remg-On Air
Gift_00_09_08_remg_11 <- sqldf("select * from Gift_00_09_08_remg_10 where account_id not in (select account_id from Gift_00_09_08_mbr) ")
Gift_00_09_08_remg_radio <- sqldf("select * from Gift_00_09_08_remg_11 where solicitation_method like '%Air%'")
Gift_00_09_08_remg_radio$Original_Gift_Source <- radio
#remg-Mail
Gift_00_09_08_remg_12 <- sqldf("select * from Gift_00_09_08_remg_11 where account_id not in (select account_id from Gift_00_09_08_remg_radio) ")
Gift_00_09_08_remg_mail <- sqldf("select * from Gift_00_09_08_remg_12 where solicitation_method like '%Mail%'")
Gift_00_09_08_remg_mail$Original_Gift_Source <- mail
Gift_00_09_08_source_root <- rbind(Gift_00_09_08_radio, Gift_00_09_08_tv, Gift_00_09_08_web, Gift_00_09_08_newyear, Gift_00_09_08_acq, Gift_00_09_08_af, Gift_00_09_08_otw, Gift_00_09_08_mail, Gift_00_09_08_yrend, Gift_00_09_08_showcase, Gift_00_09_08_so, Gift_00_09_08_mbr, Gift_00_09_08_remg_radio, Gift_00_09_08_remg_mail)
################
#outside US
outside_US$Original_Gift_Source <- tolower(outside_US$Original_Gift_Source)
#radio/fm
outside_US_radio <- outside_US %>% filter(str_detect(Original_Gift_Source, "fm|radio|am|air|renewal|adjust"))
radio <- "radio"
outside_US_radio$Original_Gift_Source <- radio
#tv
outside_US_remg <- sqldf("select * from outside_US where account_id not in (select account_id from outside_US_radio) ")
outside_US_remg_tv <- outside_US_remg %>% filter(str_detect(Original_Gift_Source, "tv|toop"))
tv <- "tv"
outside_US_remg_tv$Original_Gift_Source <- tv
outside_US_remg_01 <- sqldf("select * from outside_US_remg where account_id not in (select account_id from outside_US_remg_tv) ")
outside_US_remg_passport <- outside_US_remg_01 %>% filter(str_detect(Original_Gift_Source, "passport"))
passport <- "passport"
outside_US_remg_passport$Original_Gift_Source <- passport
outside_US_remg_02 <- sqldf("select * from outside_US_remg_01 where account_id not in (select account_id from outside_US_remg_passport) ")
outside_US_remg_onair <- sqldf("select * from outside_US_remg_02 where solicitation_method like '%On Air%'")
outside_US_remg_onair$Original_Gift_Source <- radio
outside_US_remg_03 <- sqldf("select * from outside_US_remg_02 where account_id not in (select account_id from outside_US_remg_onair) ")
outside_US_remg_web <- sqldf("select * from outside_US_remg_03 where solicitation_method like '%eb%'")
outside_US_remg_web$Original_Gift_Source <- web
outside_US_source_root <- rbind(outside_US_radio, outside_US_remg_tv, outside_US_remg_passport, outside_US_remg_onair, outside_US_remg_web)
Splitting into Regions and Divisions
#Filter states into 7 regions in United States
USNED1 <- c("CT", "ME", "MA", "NH", "RI", "VT")
USNED2 <- c("NJ", "NY", "PA")
USMWD3 <- c("IL", "IN", "MI", "OH", "WI")
USMWD4 <- c("IA", "KS", "MN", "MO", "NE", "ND", "SD")
USSD5 <- c("DE", "FL", "GA", "MD", "NC", "SC", "VA", "DC", "WV")
USSD6 <- c("AL", "KY", "MS", "TN")
USSD7 <- c("AR", "LA", "OK", "TX")
USWD8 <- c("AZ", "CO", "ID", "MO", "NV", "NM", "UT", "WY")
USWD9 <- c("AL", "CA", "HI", "OR", "WA")
#Filter for US Commonwealth and Territories
USComTer <- c("AS", "FM", "GU", "MH", "MP", "PW", "PR", "VI")
#outside US
G09_08_root <- Gift_00_09_08_source_root %>% mutate(Division = ifelse(State %in% USNED1 , 1, ifelse(State %in% USNED2 , 2, ifelse(State %in% USMWD3, 3, ifelse(State %in% USMWD4, 4, ifelse(State %in% USSD5, 5, ifelse(State %in% USSD6, 6, ifelse(State %in% USSD7, 7, ifelse(State %in% USWD8, 8, ifelse(State %in% USWD9, 9, ifelse(State %in% USComTer, 10, 0)))))))))))
G09_08_root_01 <-G09_08_root %>% mutate(Region = ifelse(Division == 1 , 1, ifelse(Division == 2 , 1, ifelse(Division == 3, 2, ifelse(Division == 4, 2, ifelse(Division == 5, 3, ifelse(Division == 6, 3, ifelse(Division == 7, 3, ifelse(Division == 8, 4, ifelse(Division == 9, 4, 5))))))))))
G10_18_root <- Gift_10_18_08_source_root %>% mutate(Division = ifelse(State %in% USNED1 , 1, ifelse(State %in% USNED2 , 2, ifelse(State %in% USMWD3, 3, ifelse(State %in% USMWD4, 4, ifelse(State %in% USSD5, 5, ifelse(State %in% USSD6, 6, ifelse(State %in% USSD7, 7, ifelse(State %in% USWD8, 8, ifelse(State %in% USWD9, 9, ifelse(State %in% USComTer, 10, 0)))))))))))
G10_18_root_01 <- G10_18_root %>% mutate(Region = ifelse(Division == 1 , 1, ifelse(Division == 2 , 1, ifelse(Division == 3, 2, ifelse(Division == 4, 2, ifelse(Division == 5, 3, ifelse(Division == 6, 3, ifelse(Division == 7, 3, ifelse(Division == 8, 4, ifelse(Division == 9, 4, 5))))))))))
#Organization
GOrg_root <- Gift_Org_Final_source_root %>% mutate(Division = ifelse(State %in% USNED1 , 1, ifelse(State %in% USNED2 , 2, ifelse(State %in% USMWD3, 3, ifelse(State %in% USMWD4, 4, ifelse(State %in% USSD5, 5, ifelse(State %in% USSD6, 6, ifelse(State %in% USSD7, 7, ifelse(State %in% USWD8, 8, ifelse(State %in% USWD9, 9, ifelse(State %in% USComTer, 10, 0)))))))))))
GOrg_root_01 <- GOrg_root %>% mutate(Region = ifelse(State %in% USNED1 , 1, ifelse(State %in% USNED2 , 2, ifelse(State %in% USMWD3, 3, ifelse(State %in% USMWD4, 4, ifelse(State %in% USSD5, 5, ifelse(State %in% USSD6, 6, ifelse(State %in% USSD7, 7, ifelse(State %in% USWD8, 8, ifelse(State %in% USWD9, 9, ifelse(State %in% USComTer, 10, 0)))))))))))
#Gift OH
Gift_OH_18_root_01 <- Gift_OH_18_source_root %>% mutate(Division = ifelse(State %in% USNED1 , 1, ifelse(State %in% USNED2 , 2, ifelse(State %in% USMWD3, 3, ifelse(State %in% USMWD4, 4, ifelse(State %in% USSD5, 5, ifelse(State %in% USSD6, 6, ifelse(State %in% USSD7, 7, ifelse(State %in% USWD8, 8, ifelse(State %in% USWD9, 9, ifelse(State %in% USComTer, 10, 0)))))))))))
Gift_OH_18_root_02 <- Gift_OH_18_root_01 %>% mutate(Region = ifelse(State %in% USNED1 , 1, ifelse(State %in% USNED2 , 2, ifelse(State %in% USMWD3, 3, ifelse(State %in% USMWD4, 4, ifelse(State %in% USSD5, 5, ifelse(State %in% USSD6, 6, ifelse(State %in% USSD7, 7, ifelse(State %in% USWD8, 8, ifelse(State %in% USWD9, 9, ifelse(State %in% USComTer, 10, 0)))))))))))
Gift_OH_09_root_01 <- Gift_OH_09_source_root %>% mutate(Division = ifelse(State %in% USNED1 , 1, ifelse(State %in% USNED2 , 2, ifelse(State %in% USMWD3, 3, ifelse(State %in% USMWD4, 4, ifelse(State %in% USSD5, 5, ifelse(State %in% USSD6, 6, ifelse(State %in% USSD7, 7, ifelse(State %in% USWD8, 8, ifelse(State %in% USWD9, 9, ifelse(State %in% USComTer, 10, 0)))))))))))
Gift_OH_09_root_02 <- Gift_OH_09_root_01 %>% mutate(Region = ifelse(State %in% USNED1 , 1, ifelse(State %in% USNED2 , 2, ifelse(State %in% USMWD3, 3, ifelse(State %in% USMWD4, 4, ifelse(State %in% USSD5, 5, ifelse(State %in% USSD6, 6, ifelse(State %in% USSD7, 7, ifelse(State %in% USWD8, 8, ifelse(State %in% USWD9, 9, ifelse(State %in% USComTer, 10, 0)))))))))))
Main files
GOrg_root_02 <- GOrg_root_01 %>% mutate(Donors_from = "Organizations")
Gout_US <- outside_US_source_root %>% mutate(Donors_from = "Outside_US")
G10_18_root_02 <- G10_18_root_01 %>% mutate(Donors_from = "Outside-OH-2010-2018")
G09_08_root_02 <- G09_08_root_01 %>% mutate(Donors_from = "Outside-OH-2000-2009")
Gift_OH_18_root_01 <- Gift_OH_18_root_02 %>% mutate(Donors_from = "OH-2010-2018")
Gift_OH_09_root_01 <- Gift_OH_09_root_02 %>% mutate(Donors_from = "OH-2000-2009")
GOrg_root_03 <- GOrg_root_02[ , -c(2,3,6)]
Gout_US_01 <- Gout_US[ , -c(2,3,6)]
G10_18_root_03 <- G10_18_root_02[ , -c(2,3,4,5,6,9)]
G09_08_root_03 <- G09_08_root_02[ , -c(2,3,4,5,6,9)]
Gift_OH_18_root_02 <- Gift_OH_18_root_01[ , -c(2,3,4,5,6,9)]
Gift_OH_09_root_02 <- Gift_OH_09_root_01[ , -c(2,3,4,5,6,9)]
Final_file <- rbind(G10_18_root_03, G09_08_root_03, Gift_OH_18_root_02, Gift_OH_09_root_02 )
Final_OH <- rbind(Gift_OH_18_root_02, Gift_OH_09_root_02)
Final_OutsideOH <- rbind(G10_18_root_03, G09_08_root_03)
Final_OH_kNN <- kNN(Final_OH, variable = c("gender"), k = 5)
sqldf("select * from Final_OH where gender = 'NA' ")
Final_OutsideOH_kNN <- kNN(Final_OutsideOH, variable = c("gender"), k = 5)
sqldf("select * from Final_OutsideOH where gender = 'NA' ")
y_finalOH_columbus <- Final_OH_kNN %>% filter(City == "Columbus")
x <- y_finalOH_columbus %>% select(Gift_Month, Gift_Year, Original_Gift_Source, Solicitation_Method, Original_Gift_Amount, Solicitation_Type )
#add dummy variables for Original Gift Source
y_01 <- x %>% mutate(GS_radio = ifelse(Original_Gift_Source == "radio", 1, 0))
y_02 <- y_01 %>% mutate(GS_acquisition = ifelse(Original_Gift_Source == "acquisition", 1, 0))
y_03 <- y_02 %>% mutate(GS_mail = ifelse(Original_Gift_Source == "mail", 1, 0))
y_04 <- y_03 %>% mutate(GS_otwhite = ifelse(Original_Gift_Source == "otwhite", 1, 0))
y_05 <- y_04 %>% mutate(GS_specialopportunity = ifelse(Original_Gift_Source == "special opportunity", 1, 0))
y_06 <- y_05 %>% mutate(GS_yearend = ifelse(Original_Gift_Source == "yearend", 1, 0))
y_07 <- y_06 %>% mutate(GS_newyear = ifelse(Original_Gift_Source == "newyear", 1, 0))
y_08 <- y_07 %>% mutate(GS_showcase = ifelse(Original_Gift_Source == "showcase", 1, 0))
y_09 <- y_08 %>% mutate(GS_passport = ifelse(Original_Gift_Source == "passport", 1, 0))
y_10 <- y_09 %>% mutate(GS_annfund = ifelse(Original_Gift_Source == "annual fund", 1, 0))
y_11 <- y_10 %>% mutate(GS_tv = ifelse(Original_Gift_Source == "tv", 1, 0))
y_12 <- y_11 %>% mutate(GS_web = ifelse(Original_Gift_Source == "web", 1, 0))
# add dummy variables for Solicitation method
y_13 <- y_12 %>% mutate(SM_onair = ifelse(Solicitation_Method == "On Air", 1, 0))
y_14 <- y_13 %>% mutate(SM_autoren = ifelse(Solicitation_Method == "Auto Renewal", 1, 0))
y_15 <- y_14 %>% mutate(SM_other = ifelse(Solicitation_Method == "Other", 1, 0))
y_16 <- y_15 %>% mutate(SM_web = ifelse(Solicitation_Method == "Web", 1, 0))
y_17 <- y_16 %>% mutate(SM_directmail = ifelse(Solicitation_Method == "Direct Mail", 1, 0))
y_18 <- y_17 %>% mutate(SM_email = ifelse(Solicitation_Method == "Email", 1, 0))
y_19 <- y_18 %>% mutate(SM_perscont = ifelse(Solicitation_Method == "Personal Contact", 1, 0))
y_20 <- y_19 %>% mutate(SM_online = ifelse(Solicitation_Method == "Online", 1, 0))
y_21 <- y_20 %>% mutate(SM_telemkt = ifelse(Solicitation_Method == "Telemarketing", 1, 0))
#add dummy variable for Solicitation Type
y_22 <- y_21 %>% mutate(ST_acq = ifelse(Solicitation_Type == "Acquisition", 1, 0))
y_23 <- y_22 %>% mutate(ST_ren = ifelse(Solicitation_Type == "Renewal", 1, 0))
y_24 <- y_23 %>% mutate(ST_othr = ifelse(Solicitation_Type == "Other", 1, 0))
y_25 <- y_24 %>% mutate(ST_exp = ifelse(Solicitation_Type == "Expired", 1, 0))
y_26 <- y_25 %>% mutate(ST_addgift = ifelse(Solicitation_Type == "Additional Gift", 1, 0))
y_finalOH_CMH <- y_26
Regression model for Original gift source - first time donors in Columbus Ohio
model_01 <- lm(Original_Gift_Amount ~ Gift_Month + Gift_Year + GS_radio + GS_acquisition + GS_otwhite + GS_specialopportunity + GS_showcase + GS_passport + GS_annfund + GS_tv + GS_web + GS_yearend + GS_newyear , data = y_finalOH_CMH)
model_02 <- lm(Original_Gift_Amount ~ Gift_Month + Gift_Year +SM_onair + SM_autoren + SM_other + SM_web +SM_online +SM_telemkt, data = y_finalOH_CMH)
model_03 <- lm(Original_Gift_Amount ~ Gift_Month + Gift_Year + ST_acq + ST_ren + ST_othr + ST_exp + ST_addgift, data = y_finalOH_CMH)
model_04 <- lm(Original_Gift_Amount ~ Gift_Month + Gift_Year + GS_radio + GS_acquisition + GS_otwhite + GS_specialopportunity + GS_showcase + GS_passport + GS_annfund + GS_tv + GS_web + SM_other +SM_autoren, data = y_finalOH_CMH)
n=22255
y <- as.numeric(y_finalOH_CMH$Original_Gift_Amount)
a<- y_finalOH_CMH %>% select(Gift_Month , Gift_Year, GS_radio , GS_acquisition , GS_otwhite , GS_specialopportunity , GS_showcase , GS_passport , GS_annfund , GS_tv , GS_web , GS_yearend , GS_newyear , SM_onair , SM_autoren , SM_other , SM_web , SM_directmail , SM_email , SM_perscont , SM_online , SM_telemkt , ST_acq , ST_ren , ST_othr , ST_exp , ST_addgift)
X1 <- cbind(rep(1,n), a$Gift_Month, a$Gift_Year, a$GS_radio, a$GS_acquisition, a$GS_otwhite, a$GS_specialopportunity, a$GS_showcase, a$GS_passport, a$GS_annfund, a$GS_tv, a$GS_web, a$GS_yearend, a$GS_newyear)
X2 <- cbind(rep(1,n), a$Gift_Month, a$Gift_Year, a$SM_onair , a$SM_autoren , a$SM_other , a$SM_web, a$SM_online, a$SM_telemkt )
X3 <- cbind(rep(1,n), a$Gift_Month, a$Gift_Year, a$ST_acq , a$ST_ren , a$ST_othr , a$ST_exp, a$ST_addgift )
X4 <- cbind(rep(1,n), a$Gift_Month, a$Gift_Year, a$GS_radio, a$GS_acquisition, a$GS_otwhite, a$GS_specialopportunity, a$GS_showcase, a$GS_passport, a$GS_annfund, a$GS_tv, a$GS_web, a$SM_other , a$SM_autoren)
Regression
summary(model_01)
##
## Call:
## lm(formula = Original_Gift_Amount ~ Gift_Month + Gift_Year +
## GS_radio + GS_acquisition + GS_otwhite + GS_specialopportunity +
## GS_showcase + GS_passport + GS_annfund + GS_tv + GS_web +
## GS_yearend + GS_newyear, data = y_finalOH_CMH)
##
## Residuals:
## Min 1Q Median 3Q Max
## -158.3 -49.6 -10.4 19.5 4879.0
##
## Coefficients: (2 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 848.32060 265.92255 3.190 0.00142 **
## Gift_Month -0.08464 0.19136 -0.442 0.65826
## Gift_Year -0.39425 0.13207 -2.985 0.00284 **
## GS_radio 57.20140 3.16451 18.076 < 2e-16 ***
## GS_acquisition -13.89571 3.76051 -3.695 0.00022 ***
## GS_otwhite 71.56062 6.50313 11.004 < 2e-16 ***
## GS_specialopportunity 40.10179 4.85244 8.264 < 2e-16 ***
## GS_showcase 137.23059 12.12359 11.319 < 2e-16 ***
## GS_passport 11.91788 4.22445 2.821 0.00479 **
## GS_annfund 68.89976 10.94693 6.294 3.15e-10 ***
## GS_tv 104.06042 3.29296 31.601 < 2e-16 ***
## GS_web 45.12795 4.47700 10.080 < 2e-16 ***
## GS_yearend NA NA NA NA
## GS_newyear NA NA NA NA
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 99.72 on 22243 degrees of freedom
## Multiple R-squared: 0.1183, Adjusted R-squared: 0.1179
## F-statistic: 271.3 on 11 and 22243 DF, p-value: < 2.2e-16
Bayesian
Data=list(y=y,X=X1)
R = 1000
keep = 20
k = 14 ## Number of independent parameters (beta0,beta1)
betabar=rep(0,k) ## Zero vector of dimensions equal to the number of independent
A=diag(rep(0.01,k)) ## Diagonal matrix of dimension k=2. The diagonal elements are 0.1.
nu =3
ssq = 1
Prior<-list(betabar=betabar,A=A,nu=nu,ssq=ssq)
nprint = 0
out_model_X1 = runireg(Data,Mcmc=list(R=R))
##
## Starting IID Sampler for Univariate Regression Model
## with 22255 observations
##
## Prior Parms:
## betabar
## [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## A
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
## [1,] 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## [2,] 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## [3,] 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## [4,] 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## [5,] 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## [6,] 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## [7,] 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00
## [8,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00
## [9,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00
## [10,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00
## [11,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00
## [12,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00
## [13,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01
## [14,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## [,14]
## [1,] 0.00
## [2,] 0.00
## [3,] 0.00
## [4,] 0.00
## [5,] 0.00
## [6,] 0.00
## [7,] 0.00
## [8,] 0.00
## [9,] 0.00
## [10,] 0.00
## [11,] 0.00
## [12,] 0.00
## [13,] 0.00
## [14,] 0.01
## nu = 3 ssq= 11273.21
##
## MCMC parms:
## R= 1000 keep= 1 nprint= 100
##
## MCMC Iteration (est time to end - min)
## 100 (0.0)
## 200 (0.1)
## 300 (0.0)
## 400 (0.1)
## 500 (0.0)
## 600 (0.0)
## 700 (0.0)
## 800 (0.0)
## 900 (0.0)
## 1000 (0.0)
## Total Time Elapsed: 0.07
model_X1_betadraw <- as.matrix(summary(out_model_X1$betadraw))
## Summary of Posterior Marginal Distributions
## Moments
## mean std dev num se rel eff sam size
## 1 805.567 249.50 7.0624 0.72 900
## 2 -0.094 0.20 0.0058 0.75 900
## 3 -0.373 0.12 0.0035 0.72 900
## 4 57.220 3.12 0.1303 1.57 450
## 5 -13.800 3.70 0.1481 1.44 450
## 6 71.706 6.67 0.2612 1.38 450
## 7 40.031 4.68 0.1401 0.81 900
## 8 136.665 11.91 0.4086 1.06 450
## 9 11.911 4.13 0.1625 1.39 450
## 10 68.614 10.00 0.3688 1.22 450
## 11 104.203 3.26 0.1435 1.75 450
## 12 45.103 4.38 0.1721 1.39 450
## 13 12.047 985.95 33.6826 1.05 450
## 14 55.164 1004.96 32.9265 0.97 900
##
## Quantiles
## 2.5% 5% 50% 95% 97.5%
## 1 320.68 405.66 800.69 1233.46 1318.63
## 2 -0.49 -0.41 -0.09 0.23 0.29
## 3 -0.63 -0.59 -0.37 -0.17 -0.13
## 4 50.98 52.04 57.25 62.16 63.03
## 5 -20.82 -19.73 -13.81 -7.88 -6.49
## 6 58.76 60.85 71.91 82.22 84.01
## 7 30.53 32.14 40.13 47.60 49.17
## 8 114.25 117.53 136.63 156.27 159.87
## 9 3.66 5.34 11.85 18.38 19.70
## 10 47.85 52.23 68.55 84.89 88.82
## 11 97.69 99.03 104.31 109.40 110.39
## 12 36.61 37.84 45.10 52.15 53.39
## 13 -1927.78 -1666.27 -12.97 1565.66 1891.21
## 14 -1984.19 -1696.08 94.26 1610.27 1905.37
## based on 900 valid draws (burn-in=100)
rownames(model_X1_betadraw) <- c("Intercept", "Gift_Month", "Gift_Year", "GS_radio", "GS_acquisition", "GS_otwhite", "GS_specialopportunity", "GS_showcase", "GS_passport", "GS_annfund", "GS_tv", "GS_web", "GS_yearend", "GS_newyear")
model_X1_betadraw
## mean std dev num se rel eff
## Intercept 805.56685877 249.4969214 7.062375326 0.7211303
## Gift_Month -0.09383613 0.1999197 0.005756945 0.7463035
## Gift_Year -0.37297280 0.1238844 0.003501847 0.7191243
## GS_radio 57.21956773 3.1160808 0.130322895 1.5742240
## GS_acquisition -13.79973816 3.6998705 0.148144850 1.4429211
## GS_otwhite 71.70557238 6.6720162 0.261223357 1.3795973
## GS_specialopportunity 40.03059241 4.6766241 0.140078297 0.8074568
## GS_showcase 136.66510331 11.9062809 0.408627782 1.0600981
## GS_passport 11.91074725 4.1305814 0.162515405 1.3931863
## GS_annfund 68.61353184 10.0049064 0.368776789 1.2227667
## GS_tv 104.20304651 3.2562127 0.143475676 1.7473244
## GS_web 45.10338679 4.3784269 0.172117742 1.3907773
## GS_yearend 12.04668356 985.9480141 33.682616484 1.0503792
## GS_newyear 55.16436866 1004.9642630 32.926457906 0.9661205
## sam size
## Intercept 900
## Gift_Month 900
## Gift_Year 900
## GS_radio 450
## GS_acquisition 450
## GS_otwhite 450
## GS_specialopportunity 900
## GS_showcase 450
## GS_passport 450
## GS_annfund 450
## GS_tv 450
## GS_web 450
## GS_yearend 450
## GS_newyear 900
names(out_model_X1)
## [1] "betadraw" "sigmasqdraw"
## plotting examples
plot(out_model_X1$betadraw)
par(mfrow = c(1,2))
draw=cbind(out_model_X1$betadraw,out_model_X1$sigsqdraw)
matplot(draw,type="l",col=c(1:4))
for (b in 1:k){
abline(betabar[b],0,col=b)
}
plot(out_model_X1$sigmasqdraw,type="l")
## Warning in plot.window(xlim, ylim, "", ...): graphical parameter "type" is
## obsolete
## Warning in title(main = main, sub = sub, xlab = xlab, ylab = ylab, ...):
## graphical parameter "type" is obsolete
## Warning in axis(1, ...): graphical parameter "type" is obsolete
## Warning in axis(2, ...): graphical parameter "type" is obsolete
# histogram
par(mfrow = c(1,2))
hist(out_model_X1$betadraw, breaks = 30,
main = "All Gift Source,Gift Yr and Months",
yaxt = "n", yaxs="i",
xlab = "Posterior Dist. of beta", ylab = "", col = "dodgerblue4", border = "gray")
In Model1, we see that Gift source as radio, otwhite, special opportunity, showcase, passport, annual fund, tv, web and yearend has a positive impact on the first time donations in Columbus Ohio in both OLS and Bayesian regression. We are assuming the prior to be non-informative. The estimate of posterior parameters is converging to PLS which means the likelihood washes out prior. The mean of the coefficient estimates lies within the credible interval of 2.5% and 97.5% and is significant as is in OLS. We also see that the credible range of possible values of model parameters is very small representing greater confidence in the model parameters. Lower autocorelation shows that it is more efficient in estimating the posterior with accuracy. When we look at the mixing properties of the identfied and unidentified parameters, it is clear that even though sigma(unidentified parameters) donot seem to mix,the variance is very high, beta is mixing relatively well.
Regression
summary(model_02)
##
## Call:
## lm(formula = Original_Gift_Amount ~ Gift_Month + Gift_Year +
## SM_onair + SM_autoren + SM_other + SM_web + SM_online + SM_telemkt,
## data = y_finalOH_CMH)
##
## Residuals:
## Min 1Q Median 3Q Max
## -146.6 -51.4 -16.9 23.5 4949.3
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 931.8086 259.9739 3.584 0.000339 ***
## Gift_Month 0.2164 0.1933 1.120 0.262905
## Gift_Year -0.4395 0.1294 -3.396 0.000685 ***
## SM_onair 77.7644 1.8880 41.188 < 2e-16 ***
## SM_autoren 78.2685 16.6211 4.709 2.5e-06 ***
## SM_other 96.9617 7.4031 13.097 < 2e-16 ***
## SM_web 32.0564 2.9882 10.728 < 2e-16 ***
## SM_online -22.3498 101.8642 -0.219 0.826334
## SM_telemkt -16.4418 72.0395 -0.228 0.819467
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 101.8 on 22246 degrees of freedom
## Multiple R-squared: 0.08032, Adjusted R-squared: 0.07999
## F-statistic: 242.9 on 8 and 22246 DF, p-value: < 2.2e-16
Bayesian
Data=list(y=y,X=X2)
R = 1000
keep = 20
k = 9 ## Number of independent parameters (beta0,beta1)
betabar=rep(0,k) ## Zero vector of dimensions equal to the number of independent
A=diag(rep(0.01,k)) ## Diagonal matrix of dimension k=2. The diagonal elements are 0.1.
nu =3
ssq = 1
Prior<-list(betabar=betabar,A=A,nu=nu,ssq=ssq)
nprint = 0
out_model_X2 = runireg(Data,Mcmc=list(R=R))
##
## Starting IID Sampler for Univariate Regression Model
## with 22255 observations
##
## Prior Parms:
## betabar
## [1] 0 0 0 0 0 0 0 0 0
## A
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
## [1,] 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## [2,] 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## [3,] 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00
## [4,] 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00
## [5,] 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00
## [6,] 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00
## [7,] 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00
## [8,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00
## [9,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01
## nu = 3 ssq= 11273.21
##
## MCMC parms:
## R= 1000 keep= 1 nprint= 100
##
## MCMC Iteration (est time to end - min)
## 100 (0.0)
## 200 (0.0)
## 300 (0.0)
## 400 (0.0)
## 500 (0.0)
## 600 (0.0)
## 700 (0.0)
## 800 (0.0)
## 900 (0.0)
## 1000 (0.0)
## Total Time Elapsed: 0.03
model_X2_betadraw <- as.matrix(summary(out_model_X2$betadraw))
## Summary of Posterior Marginal Distributions
## Moments
## mean std dev num se rel eff sam size
## 1 870.15 247.50 7.9842 0.94 900
## 2 0.22 0.20 0.0072 1.16 450
## 3 -0.41 0.12 0.0040 0.93 900
## 4 77.81 1.80 0.0503 0.70 900
## 5 77.74 17.04 0.5361 0.89 900
## 6 97.06 7.62 0.2844 1.25 450
## 7 31.86 2.97 0.0893 0.81 900
## 8 -26.45 101.12 3.4461 1.05 450
## 9 -15.50 69.94 2.5457 1.19 450
##
## Quantiles
## 2.5% 5% 50% 95% 97.5%
## 1 420.12 485.12 866.93 1263.78 1356.15
## 2 -0.20 -0.12 0.23 0.55 0.60
## 3 -0.65 -0.60 -0.41 -0.22 -0.18
## 4 74.36 74.81 77.79 80.91 81.28
## 5 43.43 49.68 77.85 106.25 110.96
## 6 81.79 84.99 96.96 109.27 111.88
## 7 26.00 26.89 31.90 36.43 37.51
## 8 -215.25 -182.84 -25.32 154.09 180.00
## 9 -142.39 -124.19 -14.61 97.56 126.65
## based on 900 valid draws (burn-in=100)
rownames(model_X2_betadraw) <- c("Intercept", "Gift_Month", "Gift_Year", "SM_onair" , "SM_autoren" , "SM_other" , "SM_web" , "SM_online" , "SM_telemkt")
model_X2_betadraw
## mean std dev num se rel eff sam size
## Intercept 870.1513958 247.4972263 7.984160150 0.9366133 900
## Gift_Month 0.2214141 0.2013358 0.007234702 1.1620953 450
## Gift_Year -0.4088204 0.1231790 0.003969384 0.9345781 900
## SM_onair 77.8140816 1.8041392 0.050254800 0.6983249 900
## SM_autoren 77.7444931 17.0428815 0.536117080 0.8905858 900
## SM_other 97.0613672 7.6169347 0.284397392 1.2546808 450
## SM_web 31.8629568 2.9738273 0.089342788 0.8123254 900
## SM_online -26.4529688 101.1218887 3.446050091 1.0451902 450
## SM_telemkt -15.4982681 69.9368217 2.545721435 1.1924838 450
## plotting examples
plot(out_model_X2$betadraw)
plot(out_model_X2$sigmasqdraw,type="l")
## Warning in plot.window(xlim, ylim, "", ...): graphical parameter "type" is
## obsolete
## Warning in title(main = main, sub = sub, xlab = xlab, ylab = ylab, ...):
## graphical parameter "type" is obsolete
## Warning in axis(1, ...): graphical parameter "type" is obsolete
## Warning in axis(2, ...): graphical parameter "type" is obsolete
par(mfrow = c(1,2))
draw=cbind(out_model_X2$betadraw,out_model_X2$sigsqdraw)
matplot(draw,type="l",col=c(1:4))
for (b in 1:k){
abline(betabar[b],0,col=b)
}
# histogram
par(mfrow = c(1,2))
hist(out_model_X2$betadraw, breaks = 30,
main = "All Sol Methd,Gift Yr and Months",
yaxt = "n", yaxs="i",
xlab = "Posterior Dist. of beta", ylab = "", col = "dodgerblue4", border = "gray")
In Model2 we see that Solicitation Method as onair, autoren, web and other has a positive impact on the first time donations in Columbus Ohio in both OLS and Bayesian regression. Online and tele-marketing has a negative effect. We are assuming the prior to be non-informative. The estimate of posterior parameters is converging to PLS which means the likelihood washes out prior. The mean of the coefficient estimates lies within the credible interval of 2.5% and 97.5% and is significant as is in OLS. We also see that the credible range of possible values of model parameters is very small representing greater confidence in the model parameters. Lower autocorelation shows that it is more efficient in estimating the posterior with accuracy. When we look at the mixing properties of the identfied and unidentified parameters, it is clear that even though sigma(unidentified parameters) donot seem to mix,the variance is very high, beta is mixing relatively well.
Regression
summary(model_03)
##
## Call:
## lm(formula = Original_Gift_Amount ~ Gift_Month + Gift_Year +
## ST_acq + ST_ren + ST_othr + ST_exp + ST_addgift, data = y_finalOH_CMH)
##
## Residuals:
## Min 1Q Median 3Q Max
## -114.9 -58.0 -23.5 37.4 4895.7
##
## Coefficients: (1 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1294.8216 256.8607 5.041 4.67e-07 ***
## Gift_Month -0.4473 0.2008 -2.228 0.02591 *
## Gift_Year -0.6111 0.1273 -4.800 1.60e-06 ***
## ST_acq 47.7283 15.6671 3.046 0.00232 **
## ST_ren 25.8820 19.8722 1.302 0.19279
## ST_othr 47.3972 15.8778 2.985 0.00284 **
## ST_exp -14.5909 32.3831 -0.451 0.65230
## ST_addgift NA NA NA NA
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 106.1 on 22248 degrees of freedom
## Multiple R-squared: 0.002074, Adjusted R-squared: 0.001804
## F-statistic: 7.705 on 6 and 22248 DF, p-value: 2.721e-08
Bayesian
Data=list(y=y,X=X3)
R = 1000
keep = 20
k = 8 ## Number of independent parameters (beta0,beta1)
betabar=rep(0,k) ## Zero vector of dimensions equal to the number of independent
A=diag(rep(0.01,k)) ## Diagonal matrix of dimension k=2. The diagonal elements are 0.1.
nu =3
ssq = 1
Prior<-list(betabar=betabar,A=A,nu=nu,ssq=ssq)
nprint = 0
out_model_X3 = runireg(Data,Mcmc=list(R=R))
##
## Starting IID Sampler for Univariate Regression Model
## with 22255 observations
##
## Prior Parms:
## betabar
## [1] 0 0 0 0 0 0 0 0
## A
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## [2,] 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00
## [3,] 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00
## [4,] 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00
## [5,] 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00
## [6,] 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00
## [7,] 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00
## [8,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01
## nu = 3 ssq= 11273.21
##
## MCMC parms:
## R= 1000 keep= 1 nprint= 100
##
## MCMC Iteration (est time to end - min)
## 100 (0.0)
## 200 (0.0)
## 300 (0.0)
## 400 (0.0)
## 500 (0.0)
## 600 (0.0)
## 700 (0.0)
## 800 (0.0)
## 900 (0.0)
## 1000 (0.0)
## Total Time Elapsed: 0.03
model_X3_betadraw <- as.matrix(summary(out_model_X3$betadraw))
## Summary of Posterior Marginal Distributions
## Moments
## mean std dev num se rel eff sam size
## 1 1059.33 473.06 18.7173 1.41 450
## 2 -0.44 0.20 0.0071 1.11 450
## 3 -0.58 0.12 0.0036 0.78 900
## 4 227.28 444.22 15.9658 1.16 450
## 5 204.92 444.11 15.9683 1.16 450
## 6 226.68 444.35 16.0060 1.17 450
## 7 165.30 444.19 16.1604 1.19 450
## 8 179.48 444.67 16.0447 1.17 450
##
## Quantiles
## 2.5% 5% 50% 95% 97.5%
## 1 162.43 269.78 1042.92 1829.09 2022.510
## 2 -0.82 -0.78 -0.44 -0.11 -0.056
## 3 -0.81 -0.78 -0.58 -0.38 -0.359
## 4 -673.15 -542.10 225.31 953.51 1081.475
## 5 -677.15 -551.62 205.66 928.10 1065.466
## 6 -672.86 -540.78 224.95 951.72 1081.005
## 7 -721.96 -594.26 173.27 878.02 1034.183
## 8 -698.86 -586.49 177.63 899.11 1026.545
## based on 900 valid draws (burn-in=100)
rownames(model_X3_betadraw) <- c("Intercept", "Gift_Month", "Gift_Year", "ST_acq" , "ST_ren" , "ST_othr" , "ST_exp", "ST_addgift")
model_X3_betadraw
## mean std dev num se rel eff sam size
## Intercept 1059.3304231 473.0589412 18.717338631 1.4089650 450
## Gift_Month -0.4449395 0.2019650 0.007102143 1.1129334 450
## Gift_Year -0.5832215 0.1207459 0.003564039 0.7841194 900
## ST_acq 227.2781949 444.2194751 15.965813123 1.1625976 450
## ST_ren 204.9231094 444.1116510 15.968296805 1.1635241 450
## ST_othr 226.6779519 444.3499653 16.005950830 1.1677642 450
## ST_exp 165.2988735 444.1928191 16.160369200 1.1912475 450
## ST_addgift 179.4750564 444.6678174 16.044734088 1.1717532 450
## plotting examples
plot(out_model_X3$betadraw)
plot(out_model_X3$sigmasqdraw,type="l")
## Warning in plot.window(xlim, ylim, "", ...): graphical parameter "type" is
## obsolete
## Warning in title(main = main, sub = sub, xlab = xlab, ylab = ylab, ...):
## graphical parameter "type" is obsolete
## Warning in axis(1, ...): graphical parameter "type" is obsolete
## Warning in axis(2, ...): graphical parameter "type" is obsolete
par(mfrow = c(1,2))
draw=cbind(out_model_X3$betadraw,out_model_X3$sigsqdraw)
matplot(draw,type="l",col=c(1:4))
for (b in 1:k){
abline(betabar[b],0,col=b)
}
# histogram
par(mfrow = c(1,2))
hist(out_model_X3$betadraw, breaks = 30,
main = "All Sol Types,Gift Yr and Months",
yaxt = "n", yaxs="i",
xlab = "Posterior Dist. of beta", ylab = "", col = "dodgerblue4", border = "gray")
In Model3 we see that posterior estimates of the Solicitation Types are very different from the OLS and also the credible range is very high and has 0 which makes it insignificant as compared to OLS where Solicitation type acquisition and other are significant. The estimate of posterior parameters is not converging to OLS which means the prior is prominent. We also see that the credible range of possible values of model parameters is very high representing very less confidence in the model parameters.
Regression
summary(model_04)
##
## Call:
## lm(formula = Original_Gift_Amount ~ Gift_Month + Gift_Year +
## GS_radio + GS_acquisition + GS_otwhite + GS_specialopportunity +
## GS_showcase + GS_passport + GS_annfund + GS_tv + GS_web +
## SM_other + SM_autoren, data = y_finalOH_CMH)
##
## Residuals:
## Min 1Q Median 3Q Max
## -173.5 -49.5 -10.5 19.0 4878.9
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 824.1480 265.8071 3.101 0.001934 **
## Gift_Month -0.2071 0.1918 -1.080 0.280296
## Gift_Year -0.3818 0.1320 -2.892 0.003831 **
## GS_radio 56.9960 3.1616 18.027 < 2e-16 ***
## GS_acquisition -13.8683 3.7561 -3.692 0.000223 ***
## GS_otwhite 70.3785 6.4971 10.832 < 2e-16 ***
## GS_specialopportunity 29.6501 5.0454 5.877 4.25e-09 ***
## GS_showcase 136.5252 12.1092 11.275 < 2e-16 ***
## GS_passport 11.7117 4.2194 2.776 0.005514 **
## GS_annfund -5.2614 14.7820 -0.356 0.721893
## GS_tv 103.9927 3.2890 31.619 < 2e-16 ***
## GS_web 44.9371 4.4716 10.049 < 2e-16 ***
## SM_other 74.7327 10.0276 7.453 9.48e-14 ***
## SM_autoren 15.6135 16.1971 0.964 0.335072
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 99.6 on 22241 degrees of freedom
## Multiple R-squared: 0.1205, Adjusted R-squared: 0.12
## F-statistic: 234.5 on 13 and 22241 DF, p-value: < 2.2e-16
Bayesian
Data=list(y=y,X=X4)
R = 1000
keep = 20
k = 16 ## Number of independent parameters (beta0,beta1)
betabar=rep(0,k) ## Zero vector of dimensions equal to the number of independent
A=diag(rep(0.01,k)) ## Diagonal matrix of dimension k=2. The diagonal elements are 0.1.
nu =3
ssq = 1
Prior<-list(betabar=betabar,A=A,nu=nu,ssq=ssq)
nprint = 0
out_model_X4 = runireg(Data,Mcmc=list(R=R))
##
## Starting IID Sampler for Univariate Regression Model
## with 22255 observations
##
## Prior Parms:
## betabar
## [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## A
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
## [1,] 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## [2,] 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## [3,] 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## [4,] 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## [5,] 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## [6,] 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## [7,] 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00
## [8,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00
## [9,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00
## [10,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00
## [11,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00
## [12,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00
## [13,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01
## [14,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## [,14]
## [1,] 0.00
## [2,] 0.00
## [3,] 0.00
## [4,] 0.00
## [5,] 0.00
## [6,] 0.00
## [7,] 0.00
## [8,] 0.00
## [9,] 0.00
## [10,] 0.00
## [11,] 0.00
## [12,] 0.00
## [13,] 0.00
## [14,] 0.01
## nu = 3 ssq= 11273.21
##
## MCMC parms:
## R= 1000 keep= 1 nprint= 100
##
## MCMC Iteration (est time to end - min)
## 100 (0.0)
## 200 (0.1)
## 300 (0.0)
## 400 (0.0)
## 500 (0.0)
## 600 (0.0)
## 700 (0.0)
## 800 (0.0)
## 900 (0.0)
## 1000 (0.0)
## Total Time Elapsed: 0.07
model_X4_betadraw <- as.matrix(summary(out_model_X4$betadraw))
## Summary of Posterior Marginal Distributions
## Moments
## mean std dev num se rel eff sam size
## 1 766.93 261.71 8.5786 0.97 900
## 2 -0.21 0.19 0.0075 1.37 450
## 3 -0.35 0.13 0.0043 0.97 900
## 4 57.04 3.07 0.1047 1.05 450
## 5 -13.61 3.61 0.1094 0.83 900
## 6 70.18 6.48 0.2816 1.70 450
## 7 29.65 5.17 0.1544 0.80 900
## 8 136.96 12.51 0.4592 1.21 450
## 9 11.57 4.27 0.1649 1.34 450
## 10 -5.24 14.17 0.4648 0.97 900
## 11 104.01 3.29 0.1117 1.03 450
## 12 44.83 4.55 0.1436 0.90 900
## 13 74.87 9.92 0.3041 0.85 900
## 14 16.06 15.97 0.4768 0.80 900
##
## Quantiles
## 2.5% 5% 50% 95% 97.5%
## 1 280.62 341.07 769.70 1195.44 1294.28
## 2 -0.57 -0.53 -0.21 0.11 0.17
## 3 -0.62 -0.57 -0.35 -0.14 -0.11
## 4 50.95 51.89 56.93 62.11 63.30
## 5 -20.86 -19.58 -13.71 -7.55 -6.50
## 6 57.47 59.16 70.32 80.53 81.92
## 7 18.90 21.03 29.73 38.09 39.77
## 8 113.75 116.98 136.31 157.67 162.55
## 9 3.19 4.68 11.70 18.58 19.97
## 10 -32.69 -29.51 -4.92 17.37 21.02
## 11 97.42 98.47 103.94 109.53 110.59
## 12 35.88 37.35 44.95 52.27 53.63
## 13 54.68 58.47 74.96 91.44 94.47
## 14 -14.03 -9.23 16.01 42.67 46.88
## based on 900 valid draws (burn-in=100)
rownames(model_X4_betadraw) <- c("Intercept", "Gift_Month", "Gift_Year", "GS_radio", "GS_acquisition", "GS_otwhite", "GS_specialopportunity", "GS_showcase", "GS_passport", "GS_annfund", "GS_tv", "GS_web", "SM_other" , "SM_autoren")
model_X4_betadraw
## mean std dev num se rel eff
## Intercept 766.9286637 261.7077393 8.578635475 0.9670430
## Gift_Month -0.2090328 0.1928651 0.007530735 1.3721763
## Gift_Year -0.3533348 0.1301035 0.004265580 0.9674319
## GS_radio 57.0372769 3.0699988 0.104737736 1.0475444
## GS_acquisition -13.6084698 3.6109245 0.109402995 0.8261599
## GS_otwhite 70.1788777 6.4812102 0.281621761 1.6992713
## GS_specialopportunity 29.6516370 5.1656375 0.154360710 0.8036523
## GS_showcase 136.9616491 12.5119363 0.459199658 1.2122622
## GS_passport 11.5746744 4.2675455 0.164892195 1.3436507
## GS_annfund -5.2396086 14.1683351 0.464825917 0.9686916
## GS_tv 104.0064469 3.2945119 0.111710463 1.0347800
## GS_web 44.8294740 4.5507074 0.143588643 0.8960348
## SM_other 74.8744855 9.9248401 0.304137914 0.8451555
## SM_autoren 16.0556082 15.9748138 0.476757480 0.8016152
## sam size
## Intercept 900
## Gift_Month 450
## Gift_Year 900
## GS_radio 450
## GS_acquisition 900
## GS_otwhite 450
## GS_specialopportunity 900
## GS_showcase 450
## GS_passport 450
## GS_annfund 900
## GS_tv 450
## GS_web 900
## SM_other 900
## SM_autoren 900
## plotting examples
plot(out_model_X4$betadraw)
plot(out_model_X4$sigmasqdraw,type="l")
## Warning in plot.window(xlim, ylim, "", ...): graphical parameter "type" is
## obsolete
## Warning in title(main = main, sub = sub, xlab = xlab, ylab = ylab, ...):
## graphical parameter "type" is obsolete
## Warning in axis(1, ...): graphical parameter "type" is obsolete
## Warning in axis(2, ...): graphical parameter "type" is obsolete
par(mfrow = c(1,2))
draw=cbind(out_model_X4$betadraw,out_model_X4$sigsqdraw)
matplot(draw,type="l",col=c(1:4))
for (b in 1:k){
abline(betabar[b],0,col=b)
}
# histogram
par(mfrow = c(1,2))
hist(out_model_X4$betadraw, breaks = 30,
main = "Gift src.Sol Methd,Yr and Months",
yaxt = "n", yaxs="i",
xlab = "Posterior Dist. of beta", ylab = "", col = "dodgerblue4", border = "gray")
In model4, we see that the Original Gift source radio, otwhite, special opportunity, showcase, passport, tv and web, Solicitation Method of autoren, and other has a positive impact on the first time donations in Columbus Ohio in both OLS and Bayesian regression. We are assuming the prior to be non-informative. Their estimate of posterior parameters is converging to OLS which means the likelihood washes out prior. The mean of the coefficient estimates lies within the credible interval of 2.5% and 97.5% and is significant as is in OLS. We also see that the credible range of possible values of model parameters is very small representing greater confidence in the model parameters. Lower autocorelation shows that it is more efficient in estimating the posterior with accuracy. When we look at the mixing properties of the identfied and unidentified parameters, it is clear that even though sigma(unidentified parameters) do not seem to mix,the variance is very high, beta is mixing relatively well.
y_finalOH_notCMH <- Final_OH_kNN %>% filter(City != "Columbus")
u <- y_finalOH_notCMH %>% select(Gift_Month, Gift_Year, Original_Gift_Source, Solicitation_Method, Original_Gift_Amount, Solicitation_Type )
#add dummy variables for Original Gift Source
u_01 <- u %>% mutate(GS_radio = ifelse(Original_Gift_Source == "radio", 1, 0))
u_02 <- u_01 %>% mutate(GS_acquisition = ifelse(Original_Gift_Source == "acquisition", 1, 0))
u_03 <- u_02 %>% mutate(GS_mail = ifelse(Original_Gift_Source == "mail", 1, 0))
u_04 <- u_03 %>% mutate(GS_otwhite = ifelse(Original_Gift_Source == "otwhite", 1, 0))
u_05 <- u_04 %>% mutate(GS_specialopportunity = ifelse(Original_Gift_Source == "special opportunity", 1, 0))
u_06 <- u_05 %>% mutate(GS_yearend = ifelse(Original_Gift_Source == "yearend", 1, 0))
u_07 <- u_06 %>% mutate(GS_newyear = ifelse(Original_Gift_Source == "newyear", 1, 0))
u_08 <- u_07 %>% mutate(GS_showcase = ifelse(Original_Gift_Source == "showcase", 1, 0))
u_09 <- u_08 %>% mutate(GS_passport = ifelse(Original_Gift_Source == "passport", 1, 0))
u_10 <- u_09 %>% mutate(GS_annfund = ifelse(Original_Gift_Source == "annual fund", 1, 0))
u_11 <- u_10 %>% mutate(GS_tv = ifelse(Original_Gift_Source == "tv", 1, 0))
u_12 <- u_11 %>% mutate(GS_web = ifelse(Original_Gift_Source == "web", 1, 0))
# add dummy variables for Solicitation method
u_13 <- u_12 %>% mutate(SM_onair = ifelse(Solicitation_Method == "On Air", 1, 0))
u_14 <- u_13 %>% mutate(SM_autoren = ifelse(Solicitation_Method == "Auto Renewal", 1, 0))
u_15 <- u_14 %>% mutate(SM_other = ifelse(Solicitation_Method == "Other", 1, 0))
u_16 <- u_15 %>% mutate(SM_web = ifelse(Solicitation_Method == "Web", 1, 0))
u_17 <- u_16 %>% mutate(SM_directmail = ifelse(Solicitation_Method == "Direct Mail", 1, 0))
u_18 <- u_17 %>% mutate(SM_email = ifelse(Solicitation_Method == "Email", 1, 0))
u_19 <- u_18 %>% mutate(SM_perscont = ifelse(Solicitation_Method == "Personal Contact", 1, 0))
u_20 <- u_19 %>% mutate(SM_online = ifelse(Solicitation_Method == "Online", 1, 0))
u_21 <- u_20 %>% mutate(SM_telemkt = ifelse(Solicitation_Method == "Telemarketing", 1, 0))
#add dummy variable for Solicitation Type
u_22 <- u_21 %>% mutate(ST_acq = ifelse(Solicitation_Type == "Acquisition", 1, 0))
u_23 <- u_22 %>% mutate(ST_ren = ifelse(Solicitation_Type == "Renewal", 1, 0))
u_24 <- u_23 %>% mutate(ST_othr = ifelse(Solicitation_Type == "Other", 1, 0))
u_25 <- u_24 %>% mutate(ST_exp = ifelse(Solicitation_Type == "Expired", 1, 0))
u_26 <- u_25 %>% mutate(ST_addgift = ifelse(Solicitation_Type == "Additional Gift", 1, 0))
y_finalOH_notCMH <- u_26
model_44 <- lm(Original_Gift_Amount ~ Gift_Month + Gift_Year + GS_radio + GS_acquisition + GS_otwhite + GS_specialopportunity + GS_showcase + GS_passport + GS_annfund + GS_tv + GS_web + SM_other +SM_autoren, data = y_finalOH_notCMH)
n=42698
y <- as.numeric(y_finalOH_notCMH$Original_Gift_Amount)
a<- y_finalOH_notCMH %>% select(Gift_Month , Gift_Year, GS_radio , GS_acquisition , GS_otwhite , GS_specialopportunity , GS_showcase , GS_passport , GS_annfund , GS_tv , GS_web , GS_yearend , GS_newyear , SM_onair , SM_autoren , SM_other , SM_web , SM_directmail , SM_email , SM_perscont , SM_online , SM_telemkt , ST_acq , ST_ren , ST_othr , ST_exp , ST_addgift)
X8 <- cbind(rep(1,n), a$Gift_Month, a$Gift_Year, a$GS_radio, a$GS_acquisition, a$GS_otwhite, a$GS_specialopportunity, a$GS_showcase, a$GS_passport, a$GS_annfund, a$GS_tv, a$GS_web, a$SM_other , a$SM_autoren)
summary(model_44)
##
## Call:
## lm(formula = Original_Gift_Amount ~ Gift_Month + Gift_Year +
## GS_radio + GS_acquisition + GS_otwhite + GS_specialopportunity +
## GS_showcase + GS_passport + GS_annfund + GS_tv + GS_web +
## SM_other + SM_autoren, data = y_finalOH_notCMH)
##
## Residuals:
## Min 1Q Median 3Q Max
## -176.0 -50.2 -14.3 26.0 4871.8
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -219.37740 182.03711 -1.205 0.22816
## Gift_Month -0.53534 0.12951 -4.134 3.58e-05 ***
## Gift_Year 0.13822 0.09039 1.529 0.12622
## GS_radio 71.74010 2.16650 33.113 < 2e-16 ***
## GS_acquisition -11.99617 2.53835 -4.726 2.30e-06 ***
## GS_otwhite 46.75149 4.46732 10.465 < 2e-16 ***
## GS_specialopportunity 36.73393 3.70357 9.919 < 2e-16 ***
## GS_showcase 141.86741 6.95104 20.410 < 2e-16 ***
## GS_passport 8.15250 3.12127 2.612 0.00901 **
## GS_annfund -45.30124 13.34787 -3.394 0.00069 ***
## GS_tv 114.19524 2.17648 52.468 < 2e-16 ***
## GS_web 52.44621 3.33170 15.742 < 2e-16 ***
## SM_other 67.93092 8.21236 8.272 < 2e-16 ***
## SM_autoren 18.81734 14.03485 1.341 0.18001
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 93.94 on 42684 degrees of freedom
## Multiple R-squared: 0.1648, Adjusted R-squared: 0.1646
## F-statistic: 647.9 on 13 and 42684 DF, p-value: < 2.2e-16
Data=list(y=y,X=X8)
R = 1000
keep = 20
k = 16 ## Number of independent parameters (beta0,beta1)
betabar=rep(0,k) ## Zero vector of dimensions equal to the number of independent
A=diag(rep(0.01,k)) ## Diagonal matrix of dimension k=2. The diagonal elements are 0.1.
nu =3
ssq = 1
Prior<-list(betabar=betabar,A=A,nu=nu,ssq=ssq)
nprint = 0
out_model_X8 = runireg(Data,Mcmc=list(R=R))
##
## Starting IID Sampler for Univariate Regression Model
## with 42698 observations
##
## Prior Parms:
## betabar
## [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## A
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
## [1,] 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## [2,] 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## [3,] 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## [4,] 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## [5,] 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## [6,] 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## [7,] 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00
## [8,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00
## [9,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00
## [10,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00
## [11,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00
## [12,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00
## [13,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01
## [14,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## [,14]
## [1,] 0.00
## [2,] 0.00
## [3,] 0.00
## [4,] 0.00
## [5,] 0.00
## [6,] 0.00
## [7,] 0.00
## [8,] 0.00
## [9,] 0.00
## [10,] 0.00
## [11,] 0.00
## [12,] 0.00
## [13,] 0.00
## [14,] 0.01
## nu = 3 ssq= 10563.21
##
## MCMC parms:
## R= 1000 keep= 1 nprint= 100
##
## MCMC Iteration (est time to end - min)
## 100 (0.1)
## 200 (0.1)
## 300 (0.1)
## 400 (0.1)
## 500 (0.1)
## 600 (0.1)
## 700 (0.0)
## 800 (0.0)
## 900 (0.0)
## 1000 (0.0)
## Total Time Elapsed: 0.13
model_X8_betadraw <- as.matrix(summary(out_model_X8$betadraw))
## Summary of Posterior Marginal Distributions
## Moments
## mean std dev num se rel eff sam size
## 1 -224.18 178.691 5.4238 0.83 900
## 2 -0.54 0.123 0.0037 0.82 900
## 3 0.14 0.089 0.0027 0.83 900
## 4 71.62 2.164 0.0800 1.23 450
## 5 -12.05 2.473 0.0824 1.00 900
## 6 46.86 4.472 0.1701 1.30 450
## 7 36.74 3.748 0.1029 0.68 900
## 8 141.36 7.062 0.2695 1.31 450
## 9 8.01 3.151 0.1057 1.01 450
## 10 -44.92 12.986 0.4958 1.31 450
## 11 114.11 2.198 0.0776 1.12 450
## 12 52.37 3.263 0.1270 1.36 450
## 13 67.69 7.817 0.2621 1.01 450
## 14 18.11 13.687 0.3928 0.74 900
##
## Quantiles
## 2.5% 5% 50% 95% 97.5%
## 1 -580.61 -5.2e+02 -220.21 65.32 100.05
## 2 -0.77 -7.4e-01 -0.55 -0.34 -0.31
## 3 -0.02 -2.6e-03 0.14 0.29 0.32
## 4 67.63 6.8e+01 71.54 75.20 75.88
## 5 -17.15 -1.6e+01 -11.95 -8.09 -7.27
## 6 38.08 3.9e+01 46.89 54.56 55.69
## 7 29.64 3.1e+01 36.77 42.72 44.12
## 8 127.46 1.3e+02 141.32 152.75 155.18
## 9 1.75 2.9e+00 7.96 13.20 14.33
## 10 -70.75 -6.6e+01 -45.07 -24.00 -19.42
## 11 109.98 1.1e+02 114.02 117.64 118.50
## 12 45.96 4.7e+01 52.35 57.88 58.72
## 13 52.59 5.5e+01 67.56 80.50 82.83
## 14 -9.70 -3.9e+00 18.33 40.12 43.89
## based on 900 valid draws (burn-in=100)
rownames(model_X8_betadraw) <- c("Intercept", "Gift_Month", "Gift_Year", "GS_radio", "GS_acquisition", "GS_otwhite", "GS_specialopportunity", "GS_showcase", "GS_passport", "GS_annfund", "GS_tv", "GS_web", "SM_other" , "SM_autoren")
model_X8_betadraw
## mean std dev num se rel eff
## Intercept -224.1803903 178.69060533 5.423823781 0.8291827
## Gift_Month -0.5441205 0.12333017 0.003730907 0.8236309
## Gift_Year 0.1406882 0.08877219 0.002690285 0.8265805
## GS_radio 71.6164965 2.16360944 0.080001800 1.2305076
## GS_acquisition -12.0479376 2.47302878 0.082407349 0.9993464
## GS_otwhite 46.8605679 4.47165946 0.170114344 1.3025276
## GS_specialopportunity 36.7350815 3.74757247 0.102891131 0.6784195
## GS_showcase 141.3585348 7.06197222 0.269490486 1.3106218
## GS_passport 8.0109448 3.15141257 0.105708341 1.0126293
## GS_annfund -44.9224475 12.98552738 0.495798001 1.3119971
## GS_tv 114.1055980 2.19820933 0.077571197 1.1207414
## GS_web 52.3689679 3.26328408 0.126981795 1.3627484
## SM_other 67.6852583 7.81668650 0.262070598 1.0116580
## SM_autoren 18.1068747 13.68674701 0.392788023 0.7412394
## sam size
## Intercept 900
## Gift_Month 900
## Gift_Year 900
## GS_radio 450
## GS_acquisition 900
## GS_otwhite 450
## GS_specialopportunity 900
## GS_showcase 450
## GS_passport 450
## GS_annfund 450
## GS_tv 450
## GS_web 450
## SM_other 450
## SM_autoren 900
## plotting examples
plot(out_model_X8$betadraw)
plot(out_model_X8$sigmasqdraw,type="l")
## Warning in plot.window(xlim, ylim, "", ...): graphical parameter "type" is
## obsolete
## Warning in title(main = main, sub = sub, xlab = xlab, ylab = ylab, ...):
## graphical parameter "type" is obsolete
## Warning in axis(1, ...): graphical parameter "type" is obsolete
## Warning in axis(2, ...): graphical parameter "type" is obsolete
par(mfrow = c(1,2))
draw=cbind(out_model_X8$betadraw,out_model_X8$sigsqdraw)
matplot(draw,type="l",col=c(1:4))
for (b in 1:k){
abline(betabar[b],0,col=b)
}
# histogram
par(mfrow = c(1,2))
hist(out_model_X8$betadraw, breaks = 30,
main = "All Gift source, Yr and Months",
yaxt = "n", yaxs="i",
xlab = "Posterior Dist. of beta", ylab = "", col = "dodgerblue4", border = "gray")
In model8, we see that the most of the Original Gift source radio, otwhite, special opportunity, showcase, passport, tv and web, Solicitation Method of autoren,and other has a positive impact on the first time donors in Ohio but outside of the city of Columbus in both OLS and Bayesian regression. We are assuming the prior to be non-informative. Their estimate of posterior parameters is converging to OLS which means the likelihood clears out prior. The mean of the coefficient estimates lies within the credible interval of 2.5% and 97.5% and is significant as is in OLS. We also see that the credible range of possible values of model parameters is very small representing greater confidence in the model parameters.
Lower autocorelation shows that it is more efficient in estimating the posterior with accuracy. When we look at the mixing properties of the identfied and unidentified parameters, it is clear that even though sigma(unidentified parameters) donot seem to mix,the variance is very high, beta is mixing relatively well.
Based on parameter estimates, Individuals outside of Columbus has a higher chance of being influenced to be first time donors by Gift source as tv, web, radio and special opportunity.
z <- GOrg_root_03 %>% select(Gift_Month, Gift_Year, Original_Gift_Source, Solicitation_Method, Original_Gift_Amount, Solicitation_Type )
#add dummy variables for Original Gift Source
z_01 <- z %>% mutate(GS_radio = ifelse(Original_Gift_Source == "radio", 1, 0))
z_02 <- z_01 %>% mutate(GS_acquisition = ifelse(Original_Gift_Source == "acquisition", 1, 0))
z_03 <- z_02 %>% mutate(GS_mail = ifelse(Original_Gift_Source == "mail", 1, 0))
z_04 <- z_03 %>% mutate(GS_otwhite = ifelse(Original_Gift_Source == "otwhite", 1, 0))
z_05 <- z_04 %>% mutate(GS_specialopportunity = ifelse(Original_Gift_Source == "special opportunity", 1, 0))
z_06 <- z_05 %>% mutate(GS_yearend = ifelse(Original_Gift_Source == "yearend", 1, 0))
z_07 <- z_06 %>% mutate(GS_newyear = ifelse(Original_Gift_Source == "newyear", 1, 0))
z_08 <- z_07 %>% mutate(GS_showcase = ifelse(Original_Gift_Source == "showcase", 1, 0))
z_09 <- z_08 %>% mutate(GS_passport = ifelse(Original_Gift_Source == "passport", 1, 0))
z_10 <- z_09 %>% mutate(GS_annfund = ifelse(Original_Gift_Source == "annual fund", 1, 0))
z_11 <- z_10 %>% mutate(GS_tv = ifelse(Original_Gift_Source == "tv", 1, 0))
z_12 <- z_11 %>% mutate(GS_web = ifelse(Original_Gift_Source == "web", 1, 0))
# add dummy variables for Solicitation method
z_13 <- z_12 %>% mutate(SM_onair = ifelse(Solicitation_Method == "On Air", 1, 0))
z_14 <- z_13 %>% mutate(SM_autoren = ifelse(Solicitation_Method == "Auto Renewal", 1, 0))
z_15 <- z_14 %>% mutate(SM_other = ifelse(Solicitation_Method == "Other", 1, 0))
z_16 <- z_15 %>% mutate(SM_web = ifelse(Solicitation_Method == "Web", 1, 0))
z_17 <- z_16 %>% mutate(SM_directmail = ifelse(Solicitation_Method == "Direct Mail", 1, 0))
z_18 <- z_17 %>% mutate(SM_email = ifelse(Solicitation_Method == "Email", 1, 0))
z_19 <- z_18 %>% mutate(SM_perscont = ifelse(Solicitation_Method == "Personal Contact", 1, 0))
z_20 <- z_19 %>% mutate(SM_online = ifelse(Solicitation_Method == "Online", 1, 0))
z_21 <- z_20 %>% mutate(SM_telemkt = ifelse(Solicitation_Method == "Telemarketing", 1, 0))
z_finalOH <- z_21
Regression model for Original gift source - first time Organizational donors
#Model_Z10 - include all-y_finalOH
model_Z10 <- lm(Original_Gift_Amount ~ Gift_Month + Gift_Year + GS_radio + GS_acquisition + GS_mail + GS_otwhite + GS_specialopportunity + GS_showcase + GS_passport + GS_annfund + GS_tv + GS_web + GS_yearend + GS_newyear , data = z_finalOH)
model_Z11 <- lm(Original_Gift_Amount ~ Gift_Month + Gift_Year + + SM_onair + SM_autoren + SM_other + SM_web + SM_directmail + SM_email + SM_perscont + SM_online + SM_telemkt , data = z_finalOH)
n = 91
y <- as.numeric(z_finalOH$Original_Gift_Amount)
Z10 <- cbind(rep(1,n), z_finalOH$Gift_Month,z_finalOH$Gift_Year,
z_finalOH$GS_radio, z_finalOH$GS_acquisition, z_finalOH$GS_mail, z_finalOH$GS_otwhite, z_finalOH$GS_specialopportunity, z_finalOH$GS_showcase, z_finalOH$GS_passport, z_finalOH$GS_annfund, z_finalOH$GS_tv,
z_finalOH$GS_web, z_finalOH$GS_yearend, z_finalOH$GS_newyear )
Z11 <- cbind(rep(1,n), z_finalOH$Gift_Month,z_finalOH$Gift_Year,
z_finalOH$SM_onair, z_finalOH$SM_autoren, z_finalOH$SM_other,
z_finalOH$SM_web, z_finalOH$SM_directmail, z_finalOH$SM_email,
z_finalOH$SM_perscont, z_finalOH$SM_online, z_finalOH$SM_telemkt)
Regression
summary(model_Z10)
##
## Call:
## lm(formula = Original_Gift_Amount ~ Gift_Month + Gift_Year +
## GS_radio + GS_acquisition + GS_mail + GS_otwhite + GS_specialopportunity +
## GS_showcase + GS_passport + GS_annfund + GS_tv + GS_web +
## GS_yearend + GS_newyear, data = z_finalOH)
##
## Residuals:
## Min 1Q Median 3Q Max
## -340.3 -163.9 -51.6 8.2 4570.8
##
## Coefficients: (6 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -27259.600 24647.914 -1.106 0.272
## Gift_Month 9.795 17.174 0.570 0.570
## Gift_Year 13.533 12.234 1.106 0.272
## GS_radio 311.540 308.552 1.010 0.316
## GS_acquisition 161.212 410.801 0.392 0.696
## GS_mail NA NA NA NA
## GS_otwhite 238.814 319.915 0.746 0.458
## GS_specialopportunity 116.659 317.292 0.368 0.714
## GS_showcase NA NA NA NA
## GS_passport NA NA NA NA
## GS_annfund 182.789 488.892 0.374 0.709
## GS_tv 276.899 373.351 0.742 0.460
## GS_web NA NA NA NA
## GS_yearend NA NA NA NA
## GS_newyear NA NA NA NA
##
## Residual standard error: 549.9 on 82 degrees of freedom
## Multiple R-squared: 0.04334, Adjusted R-squared: -0.05
## F-statistic: 0.4643 on 8 and 82 DF, p-value: 0.8778
summary(model_Z11)
##
## Call:
## lm(formula = Original_Gift_Amount ~ Gift_Month + Gift_Year +
## +SM_onair + SM_autoren + SM_other + SM_web + SM_directmail +
## SM_email + SM_perscont + SM_online + SM_telemkt, data = z_finalOH)
##
## Residuals:
## Min 1Q Median 3Q Max
## -341.8 -166.7 -64.3 -0.5 4569.3
##
## Coefficients: (6 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -25665.90 23770.95 -1.080 0.283
## Gift_Month 12.63 16.45 0.768 0.445
## Gift_Year 12.82 11.85 1.082 0.282
## SM_onair 127.18 120.35 1.057 0.294
## SM_autoren NA NA NA NA
## SM_other -17.75 325.92 -0.054 0.957
## SM_web -167.40 300.93 -0.556 0.579
## SM_directmail NA NA NA NA
## SM_email NA NA NA NA
## SM_perscont NA NA NA NA
## SM_online NA NA NA NA
## SM_telemkt NA NA NA NA
##
## Residual standard error: 541.6 on 85 degrees of freedom
## Multiple R-squared: 0.03795, Adjusted R-squared: -0.01864
## F-statistic: 0.6706 on 5 and 85 DF, p-value: 0.6468
Bayesian
Data=list(y=y,X=Z10)
R = 1000
keep = 20
k = 2 ## Number of independent parameters (beta0,beta1)
betabar=rep(0,k) ## Zero vector of dimensions equal to the number of independent
A=diag(rep(0.01,k)) ## Diagonal matrix of dimension k=2. The diagonal elements are 0.1.
nu =3
ssq = 1
Prior<-list(betabar=betabar,A=A,nu=nu,ssq=ssq)
nprint = 0
out_model_Z10 = runireg(Data,Mcmc=list(R=R))
##
## Starting IID Sampler for Univariate Regression Model
## with 91 observations
##
## Prior Parms:
## betabar
## [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## A
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
## [1,] 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## [2,] 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## [3,] 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## [4,] 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## [5,] 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## [6,] 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## [7,] 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00
## [8,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00
## [9,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00
## [10,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00
## [11,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00
## [12,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00
## [13,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01
## [14,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## [15,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## [,14] [,15]
## [1,] 0.00 0.00
## [2,] 0.00 0.00
## [3,] 0.00 0.00
## [4,] 0.00 0.00
## [5,] 0.00 0.00
## [6,] 0.00 0.00
## [7,] 0.00 0.00
## [8,] 0.00 0.00
## [9,] 0.00 0.00
## [10,] 0.00 0.00
## [11,] 0.00 0.00
## [12,] 0.00 0.00
## [13,] 0.00 0.00
## [14,] 0.01 0.00
## [15,] 0.00 0.01
## nu = 3 ssq= 288012.4
##
## MCMC parms:
## R= 1000 keep= 1 nprint= 100
##
## MCMC Iteration (est time to end - min)
## 100 (0.0)
## 200 (0.0)
## 300 (0.0)
## 400 (0.0)
## 500 (0.0)
## 600 (0.0)
## 700 (0.0)
## 800 (0.0)
## 900 (0.0)
## 1000 (0.0)
## Total Time Elapsed: 0.00
out_model_Z10_beta <- as.matrix(summary(out_model_Z10$betadraw))
## Summary of Posterior Marginal Distributions
## Moments
## mean std dev num se rel eff sam size
## 1 -1403.3 5000.9 187.22 1.26 450
## 2 13.0 16.4 0.45 0.67 900
## 3 0.8 2.7 0.10 1.27 450
## 4 4.8 2057.7 66.59 0.94 900
## 5 -172.8 2076.5 67.99 0.96 900
## 6 -238.3 5332.2 171.64 0.93 900
## 7 -53.2 2054.6 67.30 0.97 900
## 8 -168.8 2060.7 66.67 0.94 900
## 9 -257.7 5391.3 165.57 0.85 900
## 10 -119.5 5228.9 200.48 1.32 450
## 11 -122.3 2052.9 65.97 0.93 900
## 12 -36.4 2069.2 65.89 0.91 900
## 13 -187.3 2055.5 65.92 0.93 900
## 14 45.3 5333.7 183.06 1.06 450
## 15 471.8 5154.6 152.50 0.79 900
##
## Quantiles
## 2.5% 5% 50% 95% 97.5%
## 1 -11153.8 -10082.5 -1219.31 6792.4 8141
## 2 -18.8 -14.1 12.71 39.0 46
## 3 -4.7 -3.6 0.74 5.4 6
## 4 -3939.4 -3273.3 -107.64 3294.1 4068
## 5 -4211.7 -3386.1 -242.42 3149.3 4024
## 6 -10773.4 -8970.1 -87.09 8347.4 10162
## 7 -3895.0 -3315.7 -142.97 3317.2 3933
## 8 -4224.7 -3441.1 -263.90 3093.7 3862
## 9 -10938.5 -9521.3 -248.57 8452.2 10429
## 10 -10794.5 -8631.7 -6.41 8184.0 9908
## 11 -4006.5 -3436.0 -221.94 3338.8 3909
## 12 -4049.6 -3374.5 -112.23 3283.3 4011
## 13 -4202.0 -3453.7 -236.88 3273.6 3890
## 14 -10383.0 -8474.7 90.11 8766.8 10805
## 15 -9773.6 -7983.0 702.92 8508.9 9543
## based on 900 valid draws (burn-in=100)
rownames(out_model_Z10_beta) <- c("Intercept", "Gift_Month", "Gift_Year", "GS_radio", "GS_acquisition", "GS_mail", "GS_otwhite", "GS_specialopportunity", "GS_showcase", "GS_passport", "GS_annfund", "GS_tv", "GS_web", "GS_yearend", "GS_newyear" )
out_model_Z10_beta
## mean std dev num se rel eff
## Intercept -1403.2618695 5000.937339 187.2182032 1.2613506
## Gift_Month 12.9585305 16.430178 0.4466675 0.6651608
## Gift_Year 0.7984551 2.730354 0.1026452 1.2719852
## GS_radio 4.8316744 2057.680129 66.5946800 0.9426834
## GS_acquisition -172.7984835 2076.532882 67.9926291 0.9649139
## GS_mail -238.2522536 5332.223985 171.6362961 0.9324896
## GS_otwhite -53.2355515 2054.595481 67.2951469 0.9655113
## GS_specialopportunity -168.7703580 2060.681398 66.6692988 0.9420470
## GS_showcase -257.6591275 5391.329325 165.5682333 0.8487989
## GS_passport -119.5470226 5228.870227 200.4756586 1.3229703
## GS_annfund -122.3053345 2052.888697 65.9738284 0.9295122
## GS_tv -36.3538576 2069.150928 65.8865211 0.9125388
## GS_web -187.3008004 2055.511244 65.9162641 0.9255246
## GS_yearend 45.2887931 5333.707628 183.0633134 1.0601974
## GS_newyear 471.8220441 5154.625535 152.4994657 0.7877437
## sam size
## Intercept 450
## Gift_Month 900
## Gift_Year 450
## GS_radio 900
## GS_acquisition 900
## GS_mail 900
## GS_otwhite 900
## GS_specialopportunity 900
## GS_showcase 900
## GS_passport 450
## GS_annfund 900
## GS_tv 900
## GS_web 900
## GS_yearend 450
## GS_newyear 900
## plotting examples
plot(out_model_Z10$betadraw)
par(mfrow = c(1,2))
draw=cbind(out_model_Z10$betadraw,out_model_Z10$sigsqdraw)
matplot(draw,type="l",col=c(1:4))
for (b in 1:k){
abline(betabar[b],0,col=b)
}
# histogram
par(mfrow = c(1,2))
hist(out_model_Z10$betadraw, breaks = 30,
main = "Beta for Organizations",
yaxt = "n", yaxs="i",
xlab = "Posterior Dist. of Beta", ylab = "", col = "dodgerblue4", border = "gray")
#########
Data=list(y=y,X=Z11)
out_model_Z11 = runireg(Data,Mcmc=list(R=R))
##
## Starting IID Sampler for Univariate Regression Model
## with 91 observations
##
## Prior Parms:
## betabar
## [1] 0 0 0 0 0 0 0 0 0 0 0 0
## A
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
## [1,] 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## [2,] 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## [3,] 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## [4,] 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## [5,] 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00
## [6,] 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00
## [7,] 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00
## [8,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00
## [9,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00
## [10,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00
## [11,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00
## [12,] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01
## nu = 3 ssq= 288012.4
##
## MCMC parms:
## R= 1000 keep= 1 nprint= 100
##
## MCMC Iteration (est time to end - min)
## 100 (0.0)
## 200 (0.0)
## 300 (0.0)
## 400 (0.0)
## 500 (0.0)
## 600 (0.0)
## 700 (0.0)
## 800 (0.0)
## 900 (0.0)
## 1000 (0.0)
## Total Time Elapsed: 0.00
out_model_Z11_beta <- as.matrix(summary(out_model_Z11$betadraw))
## Summary of Posterior Marginal Distributions
## Moments
## mean std dev num se rel eff sam size
## 1 -1076.22 5292.5 181.732 1.06 450
## 2 16.66 15.2 0.523 1.06 450
## 3 0.76 2.9 0.085 0.78 900
## 4 -270.94 2596.7 59.246 0.47 900
## 5 15.23 5162.5 168.258 0.96 900
## 6 -408.86 2610.8 56.796 0.43 900
## 7 -460.61 2606.5 59.964 0.48 900
## 8 -381.23 2591.6 58.966 0.47 900
## 9 -13.36 5422.7 176.788 0.96 900
## 10 -140.37 5195.8 163.561 0.89 900
## 11 -9.84 5546.4 156.144 0.71 900
## 12 -48.69 5447.3 184.455 1.03 450
##
## Quantiles
## 2.5% 5% 50% 95% 97.5%
## 1 -11954.9 -9956.8 -914.57 7706.1 9238.7
## 2 -12.4 -8.8 16.86 41.3 46.2
## 3 -4.8 -3.8 0.65 5.9 6.5
## 4 -5393.7 -4557.7 -332.76 4089.9 4514.2
## 5 -10420.6 -8459.0 -5.78 8594.8 10452.7
## 6 -5490.9 -4668.7 -373.61 3846.2 4569.8
## 7 -5450.9 -4762.6 -476.87 3833.1 4530.6
## 8 -5389.0 -4730.1 -447.38 3913.9 4534.3
## 9 -10217.9 -8782.1 -198.48 8873.6 10413.1
## 10 -10467.3 -8284.1 -256.30 8487.1 9812.4
## 11 -10366.0 -8958.6 9.40 9796.9 11227.9
## 12 -10099.5 -8609.8 106.76 8706.0 10518.3
## based on 900 valid draws (burn-in=100)
rownames(out_model_Z11_beta) <- c("Intercept", "Gift_Month", "Gift_Year", "SM_onair" , "SM_autoren" , "SM_other" , "SM_web" , "SM_directmail" , "SM_email" , "SM_perscont" , "SM_online" , "SM_telemkt" )
out_model_Z11_beta
## mean std dev num se rel eff sam size
## Intercept -1076.2160043 5292.462192 181.73237527 1.0611862 450
## Gift_Month 16.6649409 15.216137 0.52337417 1.0647761 450
## Gift_Year 0.7563058 2.893406 0.08495322 0.7758601 900
## SM_onair -270.9353887 2596.711428 59.24551523 0.4684962 900
## SM_autoren 15.2343952 5162.459139 168.25768604 0.9560466 900
## SM_other -408.8614712 2610.786234 56.79641964 0.4259335 900
## SM_web -460.6106318 2606.476409 59.96407727 0.4763402 900
## SM_directmail -381.2295296 2591.554524 58.96619742 0.4659379 900
## SM_email -13.3562943 5422.713730 176.78765624 0.9565618 900
## SM_perscont -140.3731282 5195.836837 163.56098575 0.8918482 900
## SM_online -9.8372508 5546.392461 156.14423204 0.7133016 900
## SM_telemkt -48.6880789 5447.301375 184.45526110 1.0319575 450
## plotting examples
plot(out_model_Z11$betadraw)
par(mfrow = c(1,2))
draw=cbind(out_model_Z11$betadraw,out_model_Z11$sigsqdraw)
matplot(draw,type="l",col=c(1:4))
for (b in 1:k){
abline(betabar[b],0,col=b)
}
# histogram
par(mfrow = c(1,2))
hist(out_model_Z11$betadraw, breaks = 30,
main = "Beta for Organizations",
yaxt = "n", yaxs="i",
xlab = "Posterior Dist. of Beta", ylab = "", col = "dodgerblue4", border = "gray")
Based on the data we have , the Organizations seems to have no influence by any of the Solicitation methods or on any Original Gift Source on first time donation.I also tried step-wise regression but the results remained unchanged. Credible interval of the independent variables in Bayes is very large, indicating very less confidence in model parameters. The mean estimates of the posterior also do not lie in the credible interval, and thus is similar to OLS regarding the insignificance of the independent variables.
Notes: Bayesian reasoning is a natural extension of our intuition. The aim of Bayesian Linear regression is not to find the single “best” value of the model parameters, but rather determine the posterior distribution for model parameters based on data and prior. This allows us to quantify our uncertainity about the model parameters.